Datalookup documentation
The Datalookup library makes it easier to filter and manipulate your data. The module is inspired by the Django QuerySet API and its lookups.
Note
This documentation borrows some material from the Django documentation. All lookups have the same names as in Django, and some methods share their names with QuerySet methods. However, it’s important to note that Datalookup is not Django. It’s just a simple module for filtering deeply nested data.
Installation
$ pip install datalookup
Example
Throughout the examples below (and in the reference), we’ll refer to the following data, which comprises a list of authors with the books they wrote.
data = [
    {
        "id": 1,
        "author": "J. K. Rowling",
        "books": [
            {
                "name": "Harry Potter and the Chamber of Secrets",
                "genre": "Fantasy",
                "published": "1998",
                "sales": 77000000,
                "info": {
                    "pages": 251,
                    "language": "English"
                }
            },
            {
                "name": "Harry Potter and the Prisoner of Azkaban",
                "genre": "Fantasy",
                "published": "1999",
                "sales": 65000000,
                "info": {
                    "pages": 317,
                    "language": "English"
                }
            }
        ],
        "genres": [
            "Fantasy",
            "Drama",
            "Crime fiction"
        ]
    },
    {
        "id": 2,
        "author": "Agatha Christie",
        "books": [
            {
                "name": "And Then There Were None",
                "genre": "Mystery",
                "published": "1939",
                "sales": 100000000,
                "info": {
                    "pages": 272,
                    "language": "English"
                }
            }
        ],
        "genres": [
            "Murder mystery",
            "Detective story",
            "Crime fiction",
            "Thriller"
        ]
    }
]
Datalookup makes it easy to find an author by calling one of the methods of the Dataset class, like filter() or exclude(). There are multiple ways to retrieve an author.
Basic filtering
Use one of the fields of your author dictionary to filter your data.
Note
Datalookup considers a dictionary to be a Node. Each key of the dictionary is converted to a Field. Each dictionary can contain one or more fields. Some fields are considered ValueField and others RelatedField.
These fields are what let us filter your dataset.
from datalookup import Dataset
# Use Dataset to manipulate and filter your data. The following
# examples assume that this line has already been run.
books = Dataset(data)
assert len(books) == 2
# Retrieve an author using one of the fields of the author,
# such as 'id' or 'author'.
authors = books.filter(author="J. K. Rowling")
assert len(authors) == 1
assert authors[0].author == "J. K. Rowling"
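To make the Node and Field vocabulary from the note above more concrete, here is a minimal plain-Python sketch of how the keys of one author dictionary could be sorted into categories. The helper classify_fields is hypothetical and is not Datalookup's implementation; in particular, labelling a list of plain values as "ArrayField" (a name this documentation uses further down) is our assumption about where such lists fit.

```python
# Illustrative only: this mimics the idea of fields in plain Python and
# does not use or reproduce Datalookup's internals.

def classify_fields(record: dict) -> dict:
    """Map each key of `record` to a (hypothetical) field category name."""
    categories = {}
    for key, value in record.items():
        if isinstance(value, dict):
            # A nested dictionary relates this node to another node.
            categories[key] = "RelatedField"
        elif isinstance(value, list) and value and all(
            isinstance(item, dict) for item in value
        ):
            # A non-empty list of dictionaries relates this node to several nodes.
            categories[key] = "RelatedField"
        elif isinstance(value, list):
            # Assumption: any other list is treated as an array of plain values.
            categories[key] = "ArrayField"
        else:
            # int, float, str and other scalar values.
            categories[key] = "ValueField"
    return categories


author = {
    "id": 1,
    "author": "J. K. Rowling",
    "books": [{"name": "Harry Potter and the Chamber of Secrets"}],
    "genres": ["Fantasy", "Drama", "Crime fiction"],
}
fields = classify_fields(author)
print(fields)
```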
AND, OR - filtering
Keyword argument queries (in filter(), etc.) are “AND”ed together. If you need to execute more complex queries (for example, queries with OR statements), you can combine two filter requests with “|”.
# Retrieve an author using multiple filters in a single request (AND). This
# filter uses the '__icontains' lookup: the same as '__contains' but case-insensitive.
authors = books.filter(books__name__icontains="and", books__genre="Fantasy")
assert len(authors) == 1
assert authors[0].author == "J. K. Rowling"
# Retrieve an author by combining filters (OR)
authors = books.filter(author="Stephane Capponi") | books.filter(
    author="J. K. Rowling"
)
assert len(authors) == 1
assert authors[0].author == "J. K. Rowling"
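The AND and OR behaviour described above can be pictured with plain lists. This is only a sketch of the semantics, not of how Datalookup works internally: filter_all and union are hypothetical helpers, only exact matches on top-level keys are handled, and we assume that “|” behaves like a duplicate-free union.

```python
# Illustrative only: AND / OR semantics on plain dictionaries.

records = [
    {"id": 1, "author": "J. K. Rowling"},
    {"id": 2, "author": "Agatha Christie"},
]


def filter_all(rows, **conditions):
    """AND: keep rows for which every keyword condition holds."""
    return [
        row
        for row in rows
        if all(row.get(key) == value for key, value in conditions.items())
    ]


def union(left, right):
    """OR: rows found in either result, without duplicates, order preserved."""
    merged = list(left)
    merged.extend(row for row in right if row not in merged)
    return merged


# Both conditions hold for the same record, so it is kept.
both = filter_all(records, id=1, author="J. K. Rowling")
# No record satisfies both conditions at once, so nothing is kept.
neither = filter_all(records, id=1, author="Agatha Christie")
# The first query is empty, the second is not; the union keeps one record.
either = union(
    filter_all(records, author="Stephane Capponi"),
    filter_all(records, author="J. K. Rowling"),
)
```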
Cascade filtering
Sometimes you will want to filter the author but also the related books. It is possible to do that by calling the on_cascade() method before filtering.
# Filter the author but also the books of the author
authors = books.on_cascade().filter(
    books__name="Harry Potter and the Chamber of Secrets"
)
assert len(authors) == 1
assert authors[0].author == "J. K. Rowling"
# The books are also filtered
assert len(authors[0].books) == 1
assert authors[0].books[0].name == "Harry Potter and the Chamber of Secrets"
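The difference between a plain filter and a cascading one can be sketched in plain Python as follows. The function filter_authors and its cascade flag are hypothetical and only imitate the behaviour described above; they are not part of Datalookup.

```python
# Illustrative only: what "cascade" means, shown on plain dictionaries.

def filter_authors(authors, book_name, cascade=False):
    """Keep authors who wrote `book_name`; with cascade, trim their books too."""
    matched = []
    for author in authors:
        hits = [book for book in author["books"] if book["name"] == book_name]
        if not hits:
            continue
        if cascade:
            # Build a new dictionary so the input data is left untouched.
            author = {**author, "books": hits}
        matched.append(author)
    return matched


library = [
    {
        "author": "J. K. Rowling",
        "books": [
            {"name": "Harry Potter and the Chamber of Secrets"},
            {"name": "Harry Potter and the Prisoner of Azkaban"},
        ],
    },
    {"author": "Agatha Christie", "books": [{"name": "And Then There Were None"}]},
]

title = "Harry Potter and the Chamber of Secrets"
plain = filter_authors(library, title)
cascaded = filter_authors(library, title, cascade=True)
```

In this sketch, plain keeps both of the author's books, while cascaded keeps only the matching one.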
Lookup filtering
You might have seen in the previous examples the use of lookups to retrieve authors or books. Here are a couple more examples:
# Use of the '__contains' lookup to look into the 'genres' fields
authors = books.filter(genres__contains="Fantasy")
# Use of the '__gt' lookup to get all authors that wrote a book with
# more than 'X' pages
authors = books.filter(books__info__pages__gt=280)
# Same as above but with '__range'. Find authors that wrote a book
# with a number of pages between 'X' and 'Y'
authors = books.filter(books__info__pages__range=(250, 350))
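A key such as books__info__pages walks through nested dictionaries and lists. The sketch below shows one way to picture that traversal in plain Python. The helper collect_values is hypothetical, and we assume that an author matches as soon as any one of their books satisfies the condition.

```python
# Illustrative only: following a nested path through dictionaries and lists.

def collect_values(node, path):
    """Return every value reachable by following `path`, fanning out over lists."""
    if not path:
        return [node]
    if isinstance(node, list):
        found = []
        for item in node:
            found.extend(collect_values(item, path))
        return found
    if isinstance(node, dict) and path[0] in node:
        return collect_values(node[path[0]], path[1:])
    # The path cannot be followed any further from this node.
    return []


author = {
    "author": "J. K. Rowling",
    "books": [{"info": {"pages": 251}}, {"info": {"pages": 317}}],
}

pages = collect_values(author, ["books", "info", "pages"])
has_long_book = any(count > 280 for count in pages)        # like pages__gt=280
in_range = any(250 <= count <= 350 for count in pages)     # like pages__range=(250, 350)
```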
Dataset API
Here’s the formal declaration of a Dataset:
- class Dataset(data: Union[dict, list])
A Dataset is the entry point to manipulate and filter your data. Usually you’ll interact with a Dataset by chaining filters. To make this work, most methods return a new Dataset. These methods are covered in detail later in this section.
Note
A Dataset will accept a dictionary for the data parameter. Bear in mind that if you use values() on this kind of dataset, it will still return a list of dictionaries.
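One way to picture the note above is that a single dictionary behaves like a one-item list. The helper as_records below is hypothetical and only illustrates that idea; it is not part of Datalookup.

```python
# Illustrative only: treating a single dictionary as a one-item list.

def as_records(data):
    """Return `data` as a list of dictionaries, whatever shape it came in."""
    if isinstance(data, dict):
        return [data]
    return list(data)


single = as_records({"id": 1, "author": "J. K. Rowling"})
several = as_records([{"id": 1}, {"id": 2}])
```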
Class methods
from_json()
from_nodes()
Methods that return new Dataset
filter()
- Dataset.filter(**kwargs)
Returns a new Dataset containing objects that match the given filter parameters.
The filter parameters (**kwargs) should be in the format described in the Field lookups below. Multiple parameters will be ANDed together.
exclude()
- Dataset.exclude(**kwargs)
Returns a new Dataset containing objects that do not match the given filter parameters.
The filter parameters (**kwargs) should be in the format described in the Field lookups below. Multiple parameters will be ANDed together.
distinct()
values()
on_cascade()
- Dataset.on_cascade()
Must be followed by filter(), exclude() or other filtering methods (like books.on_cascade().filter(…)). This method will not only filter the current dataset but also the related field dataset. Example:
# Filter the author but also the books of the author
authors = books.on_cascade().filter(
    books__name="Harry Potter and the Chamber of Secrets"
)
assert len(authors) == 1
assert authors[0].author == "J. K. Rowling"
# The books are also filtered
assert len(authors[0].books) == 1
assert authors[0].books[0].name == "Harry Potter and the Chamber of Secrets"
Field lookups
Field lookups specify how the dataset should query the results it returns. They’re specified as keyword arguments to the Dataset methods filter() and exclude().
Basic lookup keyword arguments take the form “field__lookuptype=value” (that’s a double underscore).
As a convenience, when no lookup type is provided (as in books.filter(id=1)), the lookup type is assumed to be exact.
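The “field__lookuptype” form can be illustrated with a small parser. This is a sketch under stated assumptions rather than Datalookup's own parsing code: split_lookup and KNOWN_LOOKUPS are hypothetical names, the set simply lists the lookup types documented on this page, and a data key that happens to be named like a lookup (for example “len”) would be misread by this simple version.

```python
# Illustrative only: splitting a keyword such as "books__info__pages__gt"
# into a field path and a lookup type.
from typing import List, Tuple

KNOWN_LOOKUPS = {
    "exact", "iexact", "contains", "icontains", "in",
    "gt", "gte", "lt", "lte",
    "startswith", "istartswith", "endswith", "iendswith",
    "range", "isnull", "regex", "iregex",
    "contained_by", "overlap", "len",
}


def split_lookup(key: str) -> Tuple[List[str], str]:
    """Return (field path, lookup type); the lookup defaults to 'exact'."""
    parts = key.split("__")
    if len(parts) > 1 and parts[-1] in KNOWN_LOOKUPS:
        return parts[:-1], parts[-1]
    return parts, "exact"


print(split_lookup("id"))
print(split_lookup("books__name"))
print(split_lookup("books__info__pages__gt"))
```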
exact
Exact match.
Example:
books.filter(id__exact=1)
iexact
Case-insensitive exact match.
Example:
books.filter(author__iexact='j. k. rowling')
contains
Case-sensitive containment test. The value can also be a list.
Example:
books.filter(books__name__contains='And')
books.filter(books__name__contains=['And', 'Potter'])
icontains
Case-insensitive containment test. The value can also be a list.
Example:
books.filter(books__name__icontains='and')
books.filter(books__name__icontains=['and', 'potter'])
in
In a given iterable; often a list, tuple, or dataset. It’s not a common use case, but strings (being iterables) are accepted.
Examples:
books.filter(id__in=[1, 3, 4])
books.filter(author__in='abc')
gt
Greater than.
Example:
books.filter(id__gt=1)
gte
Greater than or equal to.
lt
Less than.
lte
Less than or equal to.
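The four comparison lookups correspond to the usual Python comparison operators. The sketch below uses the standard operator module to show the correspondence on a plain list of page counts. The COMPARISONS table and the compare helper are hypothetical and do not reflect how Datalookup is written.

```python
# Illustrative only: gt / gte / lt / lte expressed with the operator module.
import operator

COMPARISONS = {
    "gt": operator.gt,   # value > expected
    "gte": operator.ge,  # value >= expected
    "lt": operator.lt,   # value < expected
    "lte": operator.le,  # value <= expected
}


def compare(value, lookup, expected):
    """Apply the comparison named by `lookup` to `value` and `expected`."""
    return COMPARISONS[lookup](value, expected)


pages = [251, 317, 272]
over_272 = [count for count in pages if compare(count, "gt", 272)]
at_least_272 = [count for count in pages if compare(count, "gte", 272)]
under_272 = [count for count in pages if compare(count, "lt", 272)]
at_most_272 = [count for count in pages if compare(count, "lte", 272)]
```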
startswith
Case-sensitive starts-with.
Example:
books.filter(author__startswith='J.')
istartswith
Case-insensitive starts-with.
Example:
books.filter(author__istartswith='j.')
endswith
Case-sensitive ends-with.
Example:
books.filter(books__name__endswith='Azkaban')
iendswith
Case-insensitive ends-with.
Example:
books.filter(books__name__iendswith='azkaban')
range
Range test (inclusive).
Example:
books.filter(books__info__pages__range=(250, 350))
isnull
Takes either True or False. With True, matches values that are None in Python; with False, matches values that are not None.
Example:
books.filter(books__sales__isnull=True)
regex
Case-sensitive regular expression match. The regular expression syntax is that of Python’s re module.
Example:
books.filter(author__regex=r'.*Row.*')
iregex
Case-insensitive regular expression match.
Example:
books.filter(author__iregex=r'.*row.*')
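The difference between the case-sensitive and case-insensitive variants can be shown with Python's re module directly. In this sketch, regex_match is a hypothetical helper, and the use of re.search (rather than re.match or re.fullmatch) is an assumption; the patterns used here behave the same way with either.

```python
# Illustrative only: case-sensitive vs. case-insensitive matching with `re`.
import re


def regex_match(value, pattern, ignore_case=False):
    """Return True if `pattern` is found in `value`."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(pattern, value, flags) is not None


authors = ["J. K. Rowling", "Agatha Christie"]

# Like author__regex=r'.*Row.*'
case_sensitive = [name for name in authors if regex_match(name, r".*Row.*")]
# The lowercase pattern finds nothing when case matters...
lowercase_pattern = [name for name in authors if regex_match(name, r".*row.*")]
# ...but matches once case is ignored, like author__iregex=r'.*row.*'
case_insensitive = [
    name for name in authors if regex_match(name, r".*row.*", ignore_case=True)
]
```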
ArrayFields lookups
There are special lookups for an ArrayField, that is, a field whose value is a list such as:
"genres": [
    "Fantasy",
    "Drama",
    "Crime fiction"
]
contained_by
This is the opposite of the contains lookup: the objects returned will be those where the data is a subset of the values passed. For example:
authors = books.filter(
    genres__contained_by=["Fantasy", "Drama", "Crime fiction"]
)
assert len(authors) == 1
overlap
Returns objects where the data shares at least one value with the values passed. For example:
authors = books.filter(genres__overlap=["Fantasy"])
assert len(authors) == 1
authors = books.filter(genres__overlap=["Fantasy", "Thriller"])
assert len(authors) == 2
len
Matches on the length of the array. For example:
authors = books.filter(genres__len=3)
assert len(authors) == 1
assert authors[0].author == "J. K. Rowling"
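The three array lookups map naturally onto Python set operations. The sketch below restates them on the genres lists from the example data. The helpers contained_by and overlap are hypothetical stand-ins that only illustrate the semantics described above.

```python
# Illustrative only: the array lookups expressed with sets.

def contained_by(data, values):
    """True if every item of `data` also appears in `values` (subset test)."""
    return set(data) <= set(values)


def overlap(data, values):
    """True if `data` and `values` share at least one item."""
    return bool(set(data) & set(values))


rowling = ["Fantasy", "Drama", "Crime fiction"]
christie = ["Murder mystery", "Detective story", "Crime fiction", "Thriller"]
allowed = ["Fantasy", "Drama", "Crime fiction"]

rowling_contained = contained_by(rowling, allowed)          # subset of `allowed`
christie_contained = contained_by(christie, allowed)        # has genres outside it
fantasy_overlap = [overlap(genres, ["Fantasy"]) for genres in (rowling, christie)]
wide_overlap = [
    overlap(genres, ["Fantasy", "Thriller"]) for genres in (rowling, christie)
]
three_genres = [len(genres) == 3 for genres in (rowling, christie)]
```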
Node class
Here’s the formal declaration of a Node:
- class Node(data: dict)
A Node represents a dictionary where the value of each key is a specific Field. There are currently two categories of fields: ValueField and RelatedField. The first is used for int, float and str values; the second is used for dictionaries and lists of dictionaries. Thanks to these fields, we are able to filter and find every node that matches the user’s query.
Methods
filter()
- Node.filter(**kwargs)
Returns the current Node or raises an ObjectNotFound exception.
The filter parameters (**kwargs) should be in the format described in the Field lookups section. Multiple parameters will be ANDed together.