How to implement routing with n-ary tree
Routing mechanism is one of the main parts of a web service. Here I provide one approach to achieve this. This implementation is based on n-ary tree and tree search.
Many ways are available to handle routes, from regex to n-ary trees to parse registered route one by one.
Different ways
N-ary Trees
In this approach we can consider the root node as /
and all children nodes as
route parts are separated by /
. Every part is on a separated level of the
tree. A node contains one route part.
A route part can be either a string which requires full match or a path param
which is a string surrounded by curly brackets ({}
) also acceptable.
Regex
All defined routes go into a stack in the order of definition. During procession the earlier match is executed or returns the result of the executed lambda.
Parse one by one
This approach is very similar to the above one but without regex. Let us consider a registered route as a key in a dictionary. The value is the lambda which is intended to be executed when call falls on the route.
After iterating over the keys of the dictionary either use regex to identify the appropriate route or parse them and process by using string operations (for those who are not into regex).
Rule objects
Tornado framework provides a fine-grained mechanism to define and implement routings. The framework introduced rules and delegators parse and execute expressions.
The Approach
In this solution the n-ary tree approach is chosen. Each node contains one part
of the route, depth of the tree equals to the longest route splitted by /
.
All children of a node contain the next route part after the node value
separated by /
.
Register a route
For route registration we can use the register
function. It takes the route
itself, which will be parsed, splitted and considered as an identifier part.
The whole identifier of a node is the path inside the tree from the root to the
given node. Method is also part of the identifier that means the same route with
different methods are placed in two different nodes as siblings.
params
are considered as a whitelist of queryparams. Those queryparams that
are not in this list, those are not parsed and processed, and passed to the
lambda.
routing.register(route='/users',
callback=(lambda name, email: users.create_user(name, email)),
method=routing.Methods.POST,
params={'name', 'email'})
routing.register(route='/users/{id}',
callback=(lambda id: users.get_user(id)),
method=routing.Methods.GET)
Process a route
The process
function takes the three agruments as shown below and returns the
result of the executed lambda. If lambda returns void than process gives back
None
.
request_body = routing.process(event.get('path'),
event.get('queryStringParameters'),
event.get('httpMethod'))
Routing Tree
Routing tree represents the registered routes. Each node contains the route part, the full path from the root to itself, callbacks and children nodes.
Callbacks are stored in a dictionary with methods as keys and lambdas as values.
So this way one path could have a POST
and a GET
or any other methods at the
same time.
A node looks like as follows:
(value="/" resource_path='/' callbacks={} children: [
(value="users" resource_path='/users' callbacks={'GET': {callback=
lambda-expression,
params={}}} children: [])
])
The routing tree has two methods that can build the tree up or find the appropriate path.
- add: responsible to handle the registration recursively.
- find: responsible for evaluation of the call.
Both methods traverse the tree recursively, finding matching parts. If parts found, and no other path left, it returns the appropriate value, or it goes towards to one of the children if necessary.
Add route with pathparam
Pathparam is also part of the representation so it goes as a node value, like
{id}
.
This implementation requires that no multiple pathparam names are allowed.
That means no {id}
and {user_id}
are allowed at the same time.
Process a call
Routing trees find
method is responsible for getting back the lambda call
based on the route path and the method. If one of them are differ from what the
tree contains, find
raise an error accordingly.
At first, find
checks wether the path on a given node could match. If so, it
traverse through children searching for exact matching. If no matching found on
the level, it considers the path as a pathparam and goes towards the children of
the pathparam node.
Validation
register
as well as process
have their own validators against injections and malicious
characters that can mess up with the database.
Here I used very simple validation by using regex, but it is alterable according to the needs.
PATH_PARAM_PATTERN = '^{([a-zA-Z0-9]+)}$'
PATH_PART_PATTERN = '^[a-zA-Z0-9\-]+$'
QUERY_KEY_PATTERN = '^[a-zA-Z0-9]+$'
QUERY_VALUE_PATTERN = '(^[a-zA-Z0-9\-@\. ]+$)'
The implementation gives a chance to fine-tune the validation by introducing
param specific patterns. These could be placed in either validate_route_register
or
validate_route_process
or in both.
What it does not contain
It is an experimental project so it lacks some other features. The implementation of these features does not require much alteration on the code, though I leave it for others. The list below.
- Process request body and send data as JSON
- Optional params for
PATCH
- Validate params to avoid updating id
- Context param of handler func
- Builder for response (which can be found in my other post)
In this post we saw one approach of implementing routing in python, how to declare route with method, params and lambda, and took a look on how to process it.
For further reading and for the implementation you can visit https://github.com/torokmark/routing_python.