Faceting¶
Faceting is a way of grouping and narrowing search results by a common factor, for example we can group
all results which are registered on a certain date. Similar to More Like This, the faceting
functionality is implemented by setting up a special ^search/facets/$
route on any view which inherits from the
drf_haystack.mixins.FacetMixin
class.
Note
Options used for faceting is not portable across search backends. Make sure to provide options suitable for the backend you’re using.
First, read the Haystack faceting docs and set up your search index for faceting.
Serializing faceted counts¶
Faceting is a little special in terms that it does not care about SearchQuerySet filtering. Faceting is performed
by calling the SearchQuerySet().facet(field, **options)
and SearchQuerySet().date_facet(field, **options)
methods, which will apply facets to the SearchQuerySet. Next we need to call the SearchQuerySet().facet_counts()
in order to retrieve a dictionary with all the counts for the faceted fields.
We have a special drf_haystack.serializers.HaystackFacetSerializer
class which is designed to serialize
these results.
Tip
It is possible to perform faceting on a subset of the queryset, in which case you’d have to override the
get_queryset()
method of the view to limit the queryset before it is passed on to the
filter_facet_queryset()
method.
Any serializer subclassed from the HaystackFacetSerializer
is expected to have a field_options
dictionary
containing a set of default options passed to facet()
and date_facet()
.
Facet serializer example
class PersonFacetSerializer(HaystackFacetSerializer):
serialize_objects = False # Setting this to True will serialize the
# queryset into an `objects` list. This
# is useful if you need to display the faceted
# results. Defaults to False.
class Meta:
index_classes = [PersonIndex]
fields = ["firstname", "lastname", "created"]
field_options = {
"firstname": {},
"lastname": {},
"created": {
"start_date": datetime.now() - timedelta(days=3 * 365),
"end_date": datetime.now(),
"gap_by": "month",
"gap_amount": 3
}
}
The declared field_options
will be used as default options when faceting is applied to the queryset, but can be
overridden by supplying query string parameters in the following format.
?firstname=limit:1&created=start_date:20th May 2014,gap_by:year
Each field can be fed options as key:value
pairs. Multiple key:value
pairs can be supplied and
will be separated by the view.lookup_sep
attribute (which defaults to comma). Any start_date
and end_date
parameters will be parsed by the python-dateutil
parser() (which can handle most
common date formats).
Note
- The
HaystackFacetFilter
parses query string parameter options, separated with theview.lookup_sep
attribute. Each option is parsed askey:value
pairs where the:
is a hardcoded separator. Setting theview.lookup_sep
attribute to":"
will raise an AttributeError.- The date parsing in the
HaystackFacetFilter
does intentionally blow up if fed a string format it can’t handle. No exception handling is done, so make sure to convert values to a format you know it can handle before passing it to the filter. Ie., don’t let your users feed their own values in here ;)Warning
Do not use the
HaystackFacetFilter
in the regularfilter_backends
list on the serializer. It will almost certainly produce errors or weird results. Faceting filters should go in thefacet_filter_backends
list.
Example serialized content
The serialized content will look a little different than the default Haystack faceted output.
The top level items will always be queries, fields and dates, each containing a subset of fields
matching the category. In the example below, we have faceted on the fields firstname and lastname, which will
make them appear under the fields category. We also have faceted on the date field created, which will show up
under the dates category. Next, each faceted result will have a text
, count
and narrow_url
attribute which should be quite self explaining.
{ "queries": {}, "fields": { "firstname": [ { "text": "John", "count": 3, "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=firstname_exact%3AJohn" }, { "text": "Randall", "count": 2, "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=firstname_exact%3ARandall" }, { "text": "Nehru", "count": 2, "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=firstname_exact%3ANehru" } ], "lastname": [ { "text": "Porter", "count": 2, "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=lastname_exact%3APorter" }, { "text": "Odonnell", "count": 2, "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=lastname_exact%3AOdonnell" }, { "text": "Hood", "count": 2, "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=lastname_exact%3AHood" } ] }, "dates": { "created": [ { "text": "2015-05-15T00:00:00", "count": 100, "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=created_exact%3A2015-05-15+00%3A00%3A00" } ] } }
Serializing faceted results¶
When a HaystackFacetSerializer
class determines what fields to serialize, it will check
the serialize_objects
class attribute to see if it is True
or False
. Setting this value to True
will add an additional objects
field to the serialized results, which will contain the results for the
faceted SearchQuerySet
. The results will by default be serialized using the view’s serializer_class
.
If you wish to use a different serializer for serializing the results, set the
drf_haystack.mixins.FacetMixin.facet_objects_serializer_class
class attribute to whatever serializer you want
to use, or override the drf_haystack.mixins.FacetMixin.get_facet_objects_serializer_class()
method.
Example faceted results with paginated serialized objects
{
"fields": {
"firstname": [
{"...": "..."}
],
"lastname": [
{"...": "..."}
]
},
"dates": {
"created": [
{"...": "..."}
]
},
"queries": {},
"objects": {
"count": 3,
"next": "http://example.com/api/v1/search/facets/?page=2&selected_facets=firstname_exact%3AJohn",
"previous": null,
"results": [
{
"lastname": "Baker",
"firstname": "John",
"full_name": "John Baker",
"text": "John Baker\n"
},
{
"lastname": "McClane",
"firstname": "John",
"full_name": "John McClane",
"text": "John McClane\n"
}
]
}
}
Setting up the view¶
Any view that inherits the drf_haystack.mixins.FacetMixin
will have a special
action route added as
^<view-url>/facets/$
. This view action will not care about regular filtering but will by default use the
HaystackFacetFilter
to perform filtering.
Note
In order to avoid confusing the filtering mechanisms in Django Rest Framework, the FacetMixin
class has a couple of hooks for dealing with faceting, namely:
drf_haystack.mixins.FacetMixin.facet_filter_backends
- A list of filter backends that will be used to apply faceting to the queryset. Defaults to :class:drf_haystack.filters.HaystackFacetFilter`, which should be sufficient in most cases.drf_haystack.mixins.FacetMixin.facet_serializer_class
- Thedrf_haystack.serializers.HaystackFacetSerializer
instance that will be used for serializing the result.drf_haystack.mixins.FacetMixin.facet_objects_serializer_class
- Optional. Set to the serializer class which should be used for serializing faceted objects. If not set, defaults toself.serializer_class
.drf_haystack.mixins.FacetMixin.filter_facet_queryset()
- Works exactly as the normaldrf_haystack.generics.HaystackGenericAPIView.filter_queryset()
method, but will only filter on backends in theself.facet_filter_backends
list.drf_haystack.mixins.FacetMixin.get_facet_serializer_class()
- Returns theself.facet_serializer_class
class attribute.drf_haystack.mixins.FacetMixin.get_facet_serializer()
- Instantiates and returns thedrf_haystack.serializers.HaystackFacetSerializer
class returned fromdrf_haystack.mixins.FacetMixin.get_facet_serializer_class()
method.drf_haystack.mixins.FacetMixin.get_facet_objects_serializer()
- Instantiates and returns the serializer class which will be used to serialize faceted objects.drf_haystack.mixins.FacetMixin.get_facet_objects_serializer_class()
- Returns theself.facet_objects_serializer_class
, or if not set, theself.serializer_class
.
In order to set up a view which can respond to regular queries under ie ^search/$
and faceted queries under
^search/facets/$
, we could do something like this.
We can also change the query param text from selected_facets
to our own choice like params
or p
. For this
to make happen please provide facet_query_params_text
attribute as shown in the example.
class SearchPersonViewSet(FacetMixin, HaystackViewSet):
index_models = [MockPerson]
# This will be used to filter and serialize regular queries as well
# as the results if the `facet_serializer_class` has the
# `serialize_objects = True` set.
serializer_class = SearchSerializer
filter_backends = [HaystackHighlightFilter, HaystackAutocompleteFilter]
# This will be used to filter and serialize faceted results
facet_serializer_class = PersonFacetSerializer # See example above!
facet_filter_backends = [HaystackFacetFilter] # This is the default facet filter, and
# can be left out.
facet_query_params_text = 'params' #Default is 'selected_facets'
Narrowing¶
As we have seen in the examples above, the HaystackFacetSerializer
will add a narrow_url
attribute to each
result it serializes. Follow that link to narrow the search result.
The narrow_url
is constructed like this:
- Read all query parameters from the request
- Get a list of
selected_facets
- Update the query parameters by adding the current item to
selected_facets
- Pop the
drf_haystack.serializers.HaystackFacetSerializer.paginate_by_param
parameter if any in order to always start at the first page if returning a paginated result.- Return a
serializers.Hyperlink
with URL encoded query parameters
This means that for each drill-down performed, the original query parameters will be kept in order to make
the HaystackFacetFilter
happy. Additionally, all the previous selected_facets
will be kept and applied
to narrow the SearchQuerySet
properly.
Example narrowed result
{ "queries": {}, "fields": { "firstname": [ { "text": "John", "count": 1, "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=firstname_exact%3AJohn&selected_facets=lastname_exact%3AMcLaughlin" } ], "lastname": [ { "text": "McLaughlin", "count": 1, "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=firstname_exact%3AJohn&selected_facets=lastname_exact%3AMcLaughlin" } ] }, "dates": { "created": [ { "text": "2015-05-15T00:00:00", "count": 1, "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=firstname_exact%3AJohn&selected_facets=lastname_exact%3AMcLaughlin&selected_facets=created_exact%3A2015-05-15+00%3A00%3A00" } ] } }