Implementing custom array in Python
Python provides basic datastructures like dictionary, list, set, and tuple. If we need another datastruct which is different from the mentioned three, we have to implement ourones.
As I started using python in my daily work, more and more questions had cropped up about the language and its refinement. One of them was to implement a custom array.
What we would like to achieve is to have a datastruct which behaves like an array. That means, it is:
- iterable: iteration happens over
for
loop - subscriptable:
[:]
notation can be applied on it - support indexing: can support index notation,
[n]
, for both reading and writing - length:
len
built-in function returns the number of elements of the array
The Goal
The goal is to have all the abovementioned functionalities but do not have others that make representation available for users outside of the scope of the class.
The test is found below. So we want to have all of them like iter
, len
,
[n]
, or [:]
but not want to have extend
or other list related methods.
# test_composits.py
from expects import *
from composites import Composite
arr = Composite()
arr[0] = 1
arr[1] = 2
arr[2] = lambda x: x+2
arr[3] = 'a'
arr[4] = {}
arr[7] = Composite
expect(len(arr)).to(equal(8))
expect(lambda: iter(arr)).not_to(raise_error(TypeError))
expect(lambda: [i for i in arr]).not_to(raise_error(TypeError))
expect(lambda: next(arr)).to(raise_error(TypeError))
expect(lambda: len(arr)).not_to(raise_error(TypeError))
expect(lambda: arr[2]).not_to(raise_error(TypeError))
expect(arr[1]).to(equal(2))
expect(lambda: arr[:]).not_to(raise_error(TypeError))
expect(arr[0:2]).to(equal([1, 2]))
expect(lambda: arr.extend([1, 2])).to(raise_error(AttributeError))
expect(lambda: arr.__array).to(raise_error(AttributeError))
The Inherited
The most straightforward way would be to implement a custom array that is
inherited from list
. The only drawback with this is that we get more
functions than we want to, so extend
is also part of the class.
Since all methods come that list
possesses, our test code fails.
class Inherited(list):
pass
The Composed
The composition over inheritance design principle is a well-knonw approach if we want to achieve a behaviour that we mentioned above.
With composition we can hide undesired functionalities of an object and provide just those that we would like to publish to outside.
The tests are the same, so let us see what we have to implement to get a full functional object with the covated behaviour.
1. The representation
The representation inside is a list, but this won’t be available from outside.
class Composite:
def __init__(self):
self.__array = []
2. Length
To make len(a)
available on our custom object, the __len__
method should be
implemented.
def __len__(self):
return len(self.__array)
3. Index and subscription
To support indexing or subscription it is enough to implement __getitem__
method.
To make a support on for
loop, class has to implement either __getitem__
or
__iter__
methods.
def __getitem__(self, key):
return self.__array[key]
To support item assignment, __setitem__
method has to be implemented as
well. The simplest approach would be self.__array[key] = value
. But what if we
would like to assign a value to an index where index is not initialized
beforehand.
Here the type of the index is not considered as anything else just integer
def __setitem__(self, key, value):
try:
self.__array[key] = value
except IndexError:
assert key >= 0, 'key must be 0 or positive'
self.__array.extend(((key + 1) - len(self.__array)) * [None])
self.__array[key] = value
4. Iterator
__getitem__
method is like a good army-knife, it could solve almost anything
we needed here.
Let us see what we have in the instance of the class, if we call dir()
built-in function, it prints an array of methods of the instance. __iter__
is
not in there, though we can provide an iterator object.
As mentioned above, for
loop iterates over the instance, but to get the
iterator object, we have to call iter()
built-in on it. By using the iterator
object, next()
built-in is also available to get the next value.
arr_it = iter(arr)
print(next(arr_it)) # => 1
for i in arr:
if i == None:
break
print(i) # => 1, 2, <iter....> , ...
print(next(arr_it)) # => 2
Be careful, if you want to make your class to work as an iterator.
To do this, the class has to implement __iter__
and __next__
methods.
This case, for
loop uses __next__
for stepping instead of __getitem__
.
The drawback of this approach is when break
statement jumps out of the loop
and iteration stops at one point, the next for
loop starts the iteration from
that point instead of starting it from the beginning. An example is below:
class ArrayIt:
def __init__(self, values = []):
self.array = values
self.current = 0
def __getitem__(self, key):
return self.array[key]
def __iter__(self):
return self
def __next__(self):
try:
self.current += 1
return self.array[self.current - 1]
except IndexError:
self.current = 0
raise StopIteration
it = ArrayIt([1, 2, 3, 4, 5, 6, 7])
print(it[2]) # => 3
for i in it:
if i == 5:
break
print(i) # => 1, 2, 3, 4
for i in it:
print(i) # => 6, 7
So once again, having a class that implements __getitem__
(indexable) and
__next__
along with __iter__
(iterator) next to it at one time brings this
behaviour.
The Full Code
The full implementation of our array class without it being an iterator is below:
class Composite:
def __init__(self):
self.array = []
def __len__(self):
return len(self.array)
def __getitem__(self, key):
return self.array[key]
def __setitem__(self, key, value):
try:
self.array[key] = value
except IndexError:
assert key >= 0, 'key must be 0 or positive'
self.array.extend(((key + 1) - len(self.array)) * [None])
self.array[key] = value
Source codes can be found under here: https://github.com/torokmark/python_custom_array
Documentation can be found here: