Implementing custom array in Python


Python provides basic datastructures like dictionary, list, set, and tuple. If we need another datastruct which is different from the mentioned three, we have to implement ourones.

As I started using python in my daily work, more and more questions had cropped up about the language and its refinement. One of them was to implement a custom array.

What we would like to achieve is to have a datastruct which behaves like an array. That means, it is:

  • iterable: iteration happens over for loop
  • subscriptable: [:] notation can be applied on it
  • support indexing: can support index notation, [n], for both reading and writing
  • length: len built-in function returns the number of elements of the array

The Goal

The goal is to have all the abovementioned functionalities but do not have others that make representation available for users outside of the scope of the class.

The test is found below. So we want to have all of them like iter, len, [n], or [:] but not want to have extend or other list related methods.

# test_composits.py
from expects import *

from composites import Composite

arr = Composite()

arr[0] = 1
arr[1] = 2
arr[2] = lambda x: x+2
arr[3] = 'a'
arr[4] = {}
arr[7] = Composite

expect(len(arr)).to(equal(8))

expect(lambda: iter(arr)).not_to(raise_error(TypeError))
expect(lambda: [i for i in arr]).not_to(raise_error(TypeError))
expect(lambda: next(arr)).to(raise_error(TypeError))

expect(lambda: len(arr)).not_to(raise_error(TypeError))
expect(lambda: arr[2]).not_to(raise_error(TypeError))
expect(arr[1]).to(equal(2))
expect(lambda: arr[:]).not_to(raise_error(TypeError))
expect(arr[0:2]).to(equal([1, 2]))
expect(lambda: arr.extend([1, 2])).to(raise_error(AttributeError))
expect(lambda: arr.__array).to(raise_error(AttributeError))

The Inherited

The most straightforward way would be to implement a custom array that is inherited from list. The only drawback with this is that we get more functions than we want to, so extend is also part of the class.

Since all methods come that list possesses, our test code fails.

class Inherited(list):
    pass

The Composed

The composition over inheritance design principle is a well-knonw approach if we want to achieve a behaviour that we mentioned above.

With composition we can hide undesired functionalities of an object and provide just those that we would like to publish to outside.

The tests are the same, so let us see what we have to implement to get a full functional object with the covated behaviour.

1. The representation

The representation inside is a list, but this won’t be available from outside.

class Composite:
    def __init__(self):
        self.__array = []

2. Length

To make len(a) available on our custom object, the __len__ method should be implemented.

def __len__(self):
    return len(self.__array)

3. Index and subscription

To support indexing or subscription it is enough to implement __getitem__ method.

To make a support on for loop, class has to implement either __getitem__ or __iter__ methods.

def __getitem__(self, key):
    return self.__array[key]

To support item assignment, __setitem__ method has to be implemented as well. The simplest approach would be self.__array[key] = value. But what if we would like to assign a value to an index where index is not initialized beforehand.

Here the type of the index is not considered as anything else just integer

def __setitem__(self, key, value):
    try:
        self.__array[key] = value
    except IndexError:
        assert key >= 0, 'key must be 0 or positive'
        self.__array.extend(((key + 1) - len(self.__array)) * [None])
        self.__array[key] = value

4. Iterator

__getitem__ method is like a good army-knife, it could solve almost anything we needed here.

Let us see what we have in the instance of the class, if we call dir() built-in function, it prints an array of methods of the instance. __iter__ is not in there, though we can provide an iterator object.

As mentioned above, for loop iterates over the instance, but to get the iterator object, we have to call iter() built-in on it. By using the iterator object, next() built-in is also available to get the next value.

arr_it = iter(arr)
print(next(arr_it)) # => 1
for i in arr:
    if i == None: 
        break
    print(i) # => 1, 2, <iter....> , ...

print(next(arr_it)) # => 2

Be careful, if you want to make your class to work as an iterator. To do this, the class has to implement __iter__ and __next__ methods.

This case, for loop uses __next__ for stepping instead of __getitem__. The drawback of this approach is when break statement jumps out of the loop and iteration stops at one point, the next for loop starts the iteration from that point instead of starting it from the beginning. An example is below:

class ArrayIt:
  def __init__(self, values = []):
    self.array = values
    self.current = 0

  def __getitem__(self, key):
    return self.array[key]

  def __iter__(self):
    return self

  def __next__(self):
    try:
      self.current += 1
      return self.array[self.current - 1]
    except IndexError:
      self.current = 0
      raise StopIteration

it = ArrayIt([1, 2, 3, 4, 5, 6, 7])
print(it[2]) # => 3
for i in it:
  if i == 5:
    break
  print(i) # => 1, 2, 3, 4

for i in it:
  print(i) # => 6, 7

So once again, having a class that implements __getitem__ (indexable) and __next__ along with __iter__ (iterator) next to it at one time brings this behaviour.

The Full Code

The full implementation of our array class without it being an iterator is below:

class Composite:
    def __init__(self):
        self.array = []

    def __len__(self):
        return len(self.array)

    def __getitem__(self, key):
        return self.array[key]

    def __setitem__(self, key, value):
        try:
            self.array[key] = value
        except IndexError:
            assert key >= 0, 'key must be 0 or positive'
            self.array.extend(((key + 1) - len(self.array)) * [None])
            self.array[key] = value

Source codes can be found under here: https://github.com/torokmark/python_custom_array

Documentation can be found here: