TFT: an Interpretable Transformer



A deep exploration of TFT, its implementation using Darts, and how to interpret a Transformer

Rafael Guedes

Towards Data Science

Every company in the world needs forecasting to plan its operations, regardless of the sector in which it operates. There are several forecasting use cases to solve in companies, such as sales for yearly planning, customer service contacts for monthly planning of agents per language, SKU sales to plan production and/or procurement, and so on.

Although there are different use cases, all of them share one need from their stakeholders: interpretability! If you have ever deployed a forecasting model for a stakeholder, you have come across the question: ‘why is the model making such a prediction?’

In this article I explore TFT, an interpretable Transformer for time series forecasting. I also provide a step-by-step implementation of TFT to forecast weekly sales in a dataset from Walmart using Darts (a forecasting library for Python). Finally, I show how to interpret the model and its performance for a 16-week horizon forecast on the Walmart dataset.

Figure 1: Interpretable AI (image generated by the author with DALL-E)

As always, the code is available on GitHub.

What is it?

When it comes to time series, forecasts are usually influenced not only by historical values but also by other inputs. These may comprise a mixture of complex inputs like static covariates (i.e., time-invariant features such as the brand of a product), dynamic covariates with known future inputs such as the product discount, and other dynamic covariates with unknown future inputs such as the number of visitors in the coming weeks.
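To make these input types concrete, here is a minimal sketch of how they map to Darts; the file name, column names, and covariate choices below are assumptions for illustration, not the actual Walmart data. Static covariates are attached to the target TimeSeries, known-future inputs become `future_covariates`, and inputs whose future values are unknown become `past_covariates`.

```python
import pandas as pd
from darts import TimeSeries

# Hypothetical weekly sales frame with a date column and two extra inputs
df = pd.read_csv("sales.csv", parse_dates=["date"])

# Target series with a static (time-invariant) covariate, e.g. an encoded brand
target = TimeSeries.from_dataframe(
    df, time_col="date", value_cols="sales"
).with_static_covariates(pd.Series({"brand": 1.0}))

# Dynamic covariate whose future values are known in advance (e.g. planned discount)
future_cov = TimeSeries.from_dataframe(df, time_col="date", value_cols="discount")

# Dynamic covariate whose future values are unknown at prediction time (e.g. visitors)
past_cov = TimeSeries.from_dataframe(df, time_col="date", value_cols="visitors")
```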

Several Deep Learning models have been proposed to handle multiple inputs for time series forecasting, but they are typically ‘black-box’ models that do not let us understand how each component impacts the forecast they produce.

Temporal Fusion Transformers (TFT) [1] is an attention-based architecture that combines multi-horizon forecasting with interpretable insights. It has recurrent layers to learn temporal relationships at different scales, self-attention layers…
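As a preview of the implementation covered later, the sketch below shows how such a model can be trained with Darts’ TFTModel on the series built above and used for a 16-week horizon; the hyperparameter values are illustrative assumptions, not tuned settings.

```python
from darts.models import TFTModel

model = TFTModel(
    input_chunk_length=52,    # assumed: one year of weekly history as encoder input
    output_chunk_length=16,   # 16-week forecast horizon
    hidden_size=16,
    lstm_layers=1,            # recurrent layers capture local temporal patterns
    num_attention_heads=4,    # self-attention captures longer-range dependencies
    dropout=0.1,
    n_epochs=50,
)

model.fit(target, past_covariates=past_cov, future_covariates=future_cov)

# Note: for prediction, the future covariates must extend at least 16 steps
# beyond the end of the target series. TFTModel uses a quantile loss by default,
# so sampling gives probabilistic forecasts.
forecast = model.predict(n=16, num_samples=100)
```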



