---
title: "Forecasting with GenAI"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Forecasting with GenAI}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

Generative AI for forecasting is powered by transformer architectures, which are neural networks designed to model long-range dependencies in sequences. A time-series transformer is pre-trained on massive, diverse datasets, learning a general “language of time series” (recurring structures, seasonality, and signal vs. noise).

Because this knowledge is broadly applicable, the foundation model can forecast new, unseen series immediately, often with strong accuracy and without task-specific training. This is zero-shot forecasting: you bring data, the model brings general knowledge.

When you need domain-specific nuance, you can fine-tune the foundation model on your own historical data. Fine-tuning adapts the pre-trained knowledge to your context, typically improving accuracy and stability for your use cases while keeping training cost far lower than training from scratch.

## Chronos2

Chronos2 is a foundation model for time series forecasting developed by [Amazon](https://huggingface.co/amazon/chronos-2). It offers:

- Zero-shot forecasts with prediction intervals.
- Support for univariate, multivariate, and covariate-informed tasks.
- Optional use of both historical and future exogenous variables.
- Straightforward deployment for inference.

## Authentication and Setup

finnts uses Chronos2 through a self-deployed Azure model. The model weights were downloaded from [huggingface/chronos2](https://huggingface.co/amazon/chronos-2) and deployed as an Azure ML endpoint. For detailed steps on hosting, refer to the [Azure ML documentation](https://learn.microsoft.com/en-us/azure/machine-learning/concept-endpoints-online?view=azureml-api-2).

After the endpoint is up and running, set the base URL and API key as environment variables:

```{r eval=FALSE}
# In R (or add to .Renviron)
Sys.setenv(CHRONOS_API_URL = "https://xyz.inference.ml.azure.com/score")
Sys.setenv(CHRONOS_API_TOKEN = "your api key here")
```

## Using Chronos2 within finnts

Chronos2 is available as a model within the finnts unified workflow and can run as either a **local** (per time series) or **global** (across all time series) model. When running as a global model, Chronos2 learns patterns across all your time series simultaneously.

```{r eval=FALSE}
library(finnts)

# Set the Chronos2 API URL and token first (see above)

# Historical data: one monthly series from the M4 competition
hist_data <- timetk::m4_monthly %>%
  dplyr::filter(
    date >= "2010-01-01",
    id == "M2"
  ) %>%
  dplyr::rename(Date = date) %>%
  dplyr::mutate(id = as.character(id))

run_info <- set_run_info(
  project_name = "finnts_fcst",
  run_name = "finn_sub_component_run"
)

prep_data(
  run_info = run_info,
  input_data = hist_data,
  combo_variables = c("id"),
  target_variable = "value",
  date_type = "month",
  forecast_horizon = 6
)

prep_models(
  run_info = run_info,
  models_to_run = c("chronos2")
)

train_models(
  run_info = run_info,
  run_global_models = FALSE
)

final_models(run_info = run_info)

finn_output_tbl <- get_forecast_data(run_info = run_info)

head(finn_output_tbl)

# A tibble: 6 × 17
#   Combo id Model_ID            Model_Name Model_Type Recipe_ID Run_Type        Train_Test_ID Best_Model Horizon Date       Target Forecast lo_95 lo_80 hi_80 hi_95
# 1 M2    M2 chronos2--local--R1 chronos2   local      R1        Future_Forecast             1 Yes              1 2015-07-01     NA    2342. 1573. 1840. 2844. 3111.
# 2 M2    M2 chronos2--local--R1 chronos2   local      R1        Future_Forecast             1 Yes              2 2015-08-01     NA    2128. 1359. 1626. 2630. 2897.
# 3 M2    M2 chronos2--local--R1 chronos2   local      R1        Future_Forecast             1 Yes              3 2015-09-01     NA    1849. 1080. 1347. 2351. 2618.
# 4 M2    M2 chronos2--local--R1 chronos2   local      R1        Future_Forecast             1 Yes              4 2015-10-01     NA    1890. 1121. 1388. 2393. 2659.
# 5 M2    M2 chronos2--local--R1 chronos2   local      R1        Future_Forecast             1 Yes              5 2015-11-01     NA    1868. 1098. 1365. 2370. 2637.
# 6 M2    M2 chronos2--local--R1 chronos2   local      R1        Future_Forecast             1 Yes              6 2015-12-01     NA    1731.  961. 1228. 2233. 2500.
```

## Data Size Requirements and Automatic Padding

Azure ML Chronos2 endpoints require at least 3 rows of data, regardless of data frequency.

When using Azure ML endpoints, finnts automatically pads your data with zeros (backward from the earliest date) to meet this requirement. This ensures your forecasts work even with smaller datasets. The padding only affects the data sent to the API; your original training data remains unchanged in the model fit object.

## Global Model Support

Chronos2 can run as either a **local** (per time series) or **global** (across all time series) model. When running as a global model, Chronos2 learns patterns across all your time series simultaneously.

```{r eval=FALSE}
# Run Chronos2 as a global model
prep_models(
  run_info = run_info,
  models_to_run = c("chronos2")
)

train_models(
  run_info = run_info,
  run_global_models = TRUE # Chronos2 will train on all combos together
)
```

## Chronos Bolt Base

Chronos Bolt Base is a lighter-weight foundation model from the [Chronos family](https://huggingface.co/amazon/chronos-bolt-base) optimized for fast inference.

- Zero-shot forecasts with prediction intervals.
- Univariate only: **does not support external regressors**.
- Runs as a **local** (per time series) model only, not global.
- Uses the same Chronos API endpoint and credentials as Chronos2.

### Authentication and Setup

Chronos Bolt Base uses the same API endpoint and credentials as Chronos2. If you have already configured Chronos2, no additional setup is needed.
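
Because Chronos Bolt Base reads the same two environment variables as Chronos2, a quick check before a run can confirm that they are already populated. The chunk below is a minimal sketch in base R; the helper name `missing_env_vars()` is hypothetical and is not part of finnts.

```{r eval=FALSE}
# Hypothetical helper (not a finnts function): return the names of any
# environment variables that are unset or empty.
missing_env_vars <- function(vars) {
  values <- Sys.getenv(vars, unset = "")
  vars[!nzchar(values)]
}

unset_vars <- missing_env_vars(c("CHRONOS_API_URL", "CHRONOS_API_TOKEN"))

if (length(unset_vars) > 0) {
  message("Set these before forecasting: ", paste(unset_vars, collapse = ", "))
} else {
  message("Chronos credentials found.")
}
```

If anything is reported as missing, set both variables as shown in the next chunk.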

```{r eval=FALSE}
Sys.setenv(CHRONOS_API_URL = "https://xyz.inference.ml.azure.com/score")
Sys.setenv(CHRONOS_API_TOKEN = "your api key here")
```

### Using Chronos Bolt Base within finnts

```{r eval=FALSE}
library(finnts)

# Historical data: one monthly series from the M4 competition
hist_data <- timetk::m4_monthly %>%
  dplyr::filter(
    date >= "2010-01-01",
    id == "M2"
  ) %>%
  dplyr::rename(Date = date) %>%
  dplyr::mutate(id = as.character(id))

run_info <- set_run_info(
  project_name = "finnts_fcst",
  run_name = "chronos_bolt_base_run"
)

prep_data(
  run_info = run_info,
  input_data = hist_data,
  combo_variables = c("id"),
  target_variable = "value",
  date_type = "month",
  forecast_horizon = 6
)

prep_models(
  run_info = run_info,
  models_to_run = c("chronos-bolt-base")
)

train_models(
  run_info = run_info,
  run_global_models = FALSE
)

final_models(run_info = run_info)

finn_output_tbl <- get_forecast_data(run_info = run_info)

head(finn_output_tbl)

# A tibble: 6 × 17
#   Combo id Model_ID  Model_Name Model_Type Recipe_ID Run_Type Train_Test_ID Best_Model Horizon Date       Target Forecast lo_95
# 1 M2    M2 chronos-… chronos-b… local      R1        Future_…             1 Yes              1 2015-07-01     NA     2192 1401.
# 2 M2    M2 chronos-… chronos-b… local      R1        Future_…             1 Yes              2 2015-08-01     NA     2080 1289.
# 3 M2    M2 chronos-… chronos-b… local      R1        Future_…             1 Yes              3 2015-09-01     NA     2000 1209.
# 4 M2    M2 chronos-… chronos-b… local      R1        Future_…             1 Yes              4 2015-10-01     NA     2048 1257.
# 5 M2    M2 chronos-… chronos-b… local      R1        Future_…             1 Yes              5 2015-11-01     NA     1976 1185.
# 6 M2    M2 chronos-… chronos-b… local      R1        Future_…             1 Yes              6 2015-12-01     NA     1864 1073.
# ℹ 3 more variables: lo_80, hi_80, hi_95
```

## Chronos Bolt Tiny

Chronos Bolt Tiny is the smallest and fastest model from the [Chronos family](https://huggingface.co/amazon/chronos-bolt-tiny), optimized for lowest resource usage and fastest inference.

- Zero-shot forecasts with prediction intervals.
- Univariate only: **does not support external regressors**.
- Runs as a **local** (per time series) model only, not global.
- Uses the same Chronos API endpoint and credentials as Chronos2 and Chronos Bolt Base.

### Authentication and Setup

Chronos Bolt Tiny uses the same API endpoint and credentials as Chronos2 and Chronos Bolt Base. If you have already configured Chronos2, no additional setup is needed.

```{r eval=FALSE}
Sys.setenv(CHRONOS_API_URL = "https://xyz.inference.ml.azure.com/score")
Sys.setenv(CHRONOS_API_TOKEN = "your api key here")
```

### Using Chronos Bolt Tiny within finnts

```{r eval=FALSE}
library(finnts)

# Historical data: one monthly series from the M4 competition
hist_data <- timetk::m4_monthly %>%
  dplyr::filter(
    date >= "2010-01-01",
    id == "M2"
  ) %>%
  dplyr::rename(Date = date) %>%
  dplyr::mutate(id = as.character(id))

run_info <- set_run_info(
  project_name = "finnts_fcst",
  run_name = "chronos_bolt_tiny_run"
)

prep_data(
  run_info = run_info,
  input_data = hist_data,
  combo_variables = c("id"),
  target_variable = "value",
  date_type = "month",
  forecast_horizon = 6
)

prep_models(
  run_info = run_info,
  models_to_run = c("chronos-bolt-tiny")
)

train_models(
  run_info = run_info,
  run_global_models = FALSE
)

final_models(run_info = run_info)

finn_output_tbl <- get_forecast_data(run_info = run_info)

head(finn_output_tbl)

# A tibble: 6 × 17
#   Combo id Model_ID  Model_Name Model_Type Recipe_ID Run_Type Train_Test_ID Best_Model Horizon Date       Target Forecast lo_95
# 1 M2    M2 chronos-… chronos-b… local      R1        Future_…             1 Yes              1 2015-07-01     NA     2192 1411.
# 2 M2    M2 chronos-… chronos-b… local      R1        Future_…             1 Yes              2 2015-08-01     NA     2096 1315.
# 3 M2    M2 chronos-… chronos-b… local      R1        Future_…             1 Yes              3 2015-09-01     NA     2064 1283.
# 4 M2    M2 chronos-… chronos-b… local      R1        Future_…             1 Yes              4 2015-10-01     NA     2080 1299.
# 5 M2    M2 chronos-… chronos-b… local      R1        Future_…             1 Yes              5 2015-11-01     NA     2024 1243.
# 6 M2    M2 chronos-… chronos-b… local      R1        Future_…             1 Yes              6 2015-12-01     NA     1936 1155.
# ℹ 3 more variables: lo_80, hi_80, hi_95
```

## TimesFM

[TimesFM](https://github.com/google-research/timesfm) is a foundation model for time series forecasting developed by Google Research.
It delivers zero-shot forecasts across different domains and frequencies.

- Zero-shot forecasts without any local training.
- Univariate only: **does not support external regressors**.
- Runs as a **local** (per time series) model only, not global.

### Authentication and Setup

TimesFM requires its own API endpoint and token, configured via environment variables:

```{r eval=FALSE}
Sys.setenv(TIMESFM_API_URL = "https://your-timesfm-endpoint.inference.ml.azure.com/score")
Sys.setenv(TIMESFM_API_TOKEN = "your api key here")
```

### Using TimesFM within finnts

```{r eval=FALSE}
library(finnts)

# Historical data: one monthly series from the M4 competition
hist_data <- timetk::m4_monthly %>%
  dplyr::filter(
    date >= "2010-01-01",
    id == "M2"
  ) %>%
  dplyr::rename(Date = date) %>%
  dplyr::mutate(id = as.character(id))

run_info <- set_run_info(
  project_name = "finnts_fcst",
  run_name = "timesfm_run"
)

prep_data(
  run_info = run_info,
  input_data = hist_data,
  combo_variables = c("id"),
  target_variable = "value",
  date_type = "month",
  forecast_horizon = 6
)

prep_models(
  run_info = run_info,
  models_to_run = c("timesfm")
)

train_models(
  run_info = run_info,
  run_global_models = FALSE
)

final_models(run_info = run_info)

finn_output_tbl <- get_forecast_data(run_info = run_info)

head(finn_output_tbl)

# A tibble: 6 × 17
#   Combo id Model_ID  Model_Name Model_Type Recipe_ID Run_Type Train_Test_ID Best_Model Horizon Date       Target Forecast lo_95
# 1 M2    M2 timesfm-… timesfm    local      R1        Future_…             1 Yes              1 2015-07-01     NA    2394. 1730.
# 2 M2    M2 timesfm-… timesfm    local      R1        Future_…             1 Yes              2 2015-08-01     NA    2082. 1418.
# 3 M2    M2 timesfm-… timesfm    local      R1        Future_…             1 Yes              3 2015-09-01     NA    1977. 1313.
# 4 M2    M2 timesfm-… timesfm    local      R1        Future_…             1 Yes              4 2015-10-01     NA    2025. 1360.
# 5 M2    M2 timesfm-… timesfm    local      R1        Future_…             1 Yes              5 2015-11-01     NA    1841. 1177.
# 6 M2    M2 timesfm-… timesfm    local      R1        Future_…             1 Yes              6 2015-12-01     NA    1672. 1008.
```

## TimeGPT

TimeGPT is a transformer-based foundation model for time series developed by [Nixtla](https://nixtla.github.io/nixtlar/). It delivers:

- Zero-shot forecasts with prediction intervals.
- Support for multiple series.
- Optional use of both historical and future exogenous variables.
- A clear path to fine-tuning when deeper domain adaptation is required.

## Authentication and Setup

You can set up TimeGPT in either of the following ways:

- Azure-deployed TimeGPT ([TimeGen-1](https://nixtla.github.io/nixtlar/articles/azure-quickstart.html))
- [Nixtla API](https://www.nixtla.io/docs/setup/setting_up_your_api_key)

### Azure-hosted TimeGPT

Set the Azure base URL and API key.

#### Option A: Environment variables

```{r eval=FALSE}
# In R (or add to .Renviron)
Sys.setenv(NIXTLA_BASE_URL = "https://your-azure-deployed-timegen.azure.com/")
Sys.setenv(NIXTLA_API_KEY = "your_api_key_here")
# Note: the current version of nixtlar requires the base URL to end with "/"
```

#### Option B: Using nixtlar helper

```{r eval=FALSE}
# devtools::install_github("Nixtla/nixtlar")
library(nixtlar)

nixtla_client_setup(
  base_url = "Base URL here",
  api_key = "API key here"
)
```

### Nixtla API

If you use the API directly from Nixtla, a base URL is not required.


#### Option A: Environment variable

```{r eval=FALSE}
Sys.setenv(NIXTLA_API_KEY = "your_api_key_here")
```

#### Option B: Using nixtlar helper

```{r eval=FALSE}
# devtools::install_github("Nixtla/nixtlar")
library(nixtlar)

nixtla_set_api_key(api_key = "Your API key here")
```

## Quick start with nixtlar (standalone)

```{r eval=FALSE}
# devtools::install_github("Nixtla/nixtlar")
library(nixtlar)

df <- nixtlar::electricity

# Forecast the next 8 steps with 80% and 95% prediction intervals
fcst <- nixtla_client_forecast(
  df,
  h = 8,
  level = c(80, 95)
  # If using Azure-deployed TimeGen, also pass: model = "azureai"
)

head(fcst)
```

## Using TimeGPT within finnts

TimeGPT is available as a model within the finnts unified workflow and can run as either a **local** (per time series) or **global** (across all time series) model. When running as a global model, TimeGPT learns patterns across all your time series simultaneously.

```{r eval=FALSE}
library(finnts)

# Set up the TimeGPT API keys first (see above)

# Check the data requirements provided by Nixtla:
# https://www.nixtla.io/docs/data_requirements/data_requirements

hist_data <- timetk::m4_monthly %>%
  dplyr::filter(
    date >= "2010-01-01",
    id == "M2"
  ) %>%
  dplyr::rename(Date = date) %>%
  dplyr::mutate(id = as.character(id))

run_info <- set_run_info(
  project_name = "finnts_fcst",
  run_name = "finn_sub_component_run"
)

prep_data(
  run_info = run_info,
  input_data = hist_data,
  combo_variables = c("id"),
  target_variable = "value",
  date_type = "month",
  forecast_horizon = 6
)

prep_models(
  run_info = run_info,
  models_to_run = c("timegpt")
)

train_models(
  run_info = run_info,
  run_global_models = FALSE
)

final_models(run_info = run_info)

finn_output_tbl <- get_forecast_data(run_info = run_info)

head(finn_output_tbl)

# A tibble: 6 x 17
#   Combo id Model_ID      Model_Name Model_Type Recipe_ID Run_Type Train_Test_ID Best_Model Horizon
# 1 M2    M2 timegpt--loc~ timegpt    local      R1        Future_~             1 Yes              1
# 2 M2    M2 timegpt--loc~ timegpt    local      R1        Future_~             1 Yes              2
# 3 M2    M2 timegpt--loc~ timegpt    local      R1        Future_~             1 Yes              3
# 4 M2    M2 timegpt--loc~ timegpt    local      R1        Future_~             1 Yes              4
# 5 M2    M2 timegpt--loc~ timegpt    local      R1        Future_~             1 Yes              5
# 6 M2    M2 timegpt--loc~ timegpt    local      R1        Future_~             1 Yes              6
# i 7 more variables: Date, Target, Forecast, lo_95, lo_80,
#   hi_80, hi_95
```

## Data Size Requirements and Automatic Padding

Azure AI TimeGPT endpoints have minimum data size requirements based on data frequency:

- **Daily data**: Minimum 300 rows
- **Weekly data**: Minimum 64 rows
- **Monthly/Quarterly/Yearly data**: Minimum 48 rows

When using Azure AI endpoints, finnts automatically pads your data with zeros (backward from the earliest date) to meet these requirements. This ensures your forecasts work even with smaller datasets. The padding only affects the data sent to the API; your original training data remains unchanged in the model fit object.

**Note:** Padding is applied only for Azure AI endpoints. The default Nixtla API does not have these constraints.

## Long-Horizon Forecasting

For forecasts exceeding two seasonal periods, TimeGPT automatically uses the `timegpt-1-long-horizon` model, which is optimized for longer forecast horizons. The threshold for "long horizon" depends on your data frequency:

- **Daily**: > 14 days (2 weeks)
- **Weekly**: > 104 weeks (2 years)
- **Monthly**: > 24 months (2 years)
- **Quarterly**: > 8 quarters (2 years)
- **Yearly**: > 2 years

This selection happens automatically based on your `forecast_horizon` and `date_type` parameters, so no additional configuration is needed.

## Hyperparameter Tuning

TimeGPT supports tuning of its fine-tuning parameters:

- **`finetune_steps`**: Number of fine-tuning steps (range: 0-200)
- **`finetune_depth`**: Fine-tuning depth/layers (range: 1-5)

These parameters are tuned automatically when you set `num_hyperparameters` in `prep_models()`. The tuning process uses validation splits to find optimal values, then refits the model with the best parameters.

```{r eval=FALSE}
prep_models(
  run_info = run_info,
  models_to_run = c("timegpt"),
  num_hyperparameters = 4 # Will tune finetune_steps and finetune_depth
)
```

## Global Model Support

TimeGPT can run as either a **local** (per time series) or **global** (across all time series) model. When running as a global model, TimeGPT learns patterns across all your time series simultaneously.

```{r eval=FALSE}
# Run TimeGPT as a global model
prep_models(
  run_info = run_info,
  models_to_run = c("timegpt")
)

train_models(
  run_info = run_info,
  run_global_models = TRUE # TimeGPT will train on all combos together
)
```
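
As a supplement to the thresholds listed under Long-Horizon Forecasting, the selection rule can be sketched in a few lines of base R. This is an illustration of the rule as described there, not the finnts implementation: the helper `choose_timegpt_model()` is hypothetical, the `date_type` spellings other than `"month"` are assumed, and `"timegpt-1"` is assumed to be the name of the standard model.

```{r eval=FALSE}
# Illustrative only: thresholds copied from the Long-Horizon Forecasting list.
# The date_type names other than "month" and the model name "timegpt-1"
# are assumptions made for this sketch.
long_horizon_threshold <- c(day = 14, week = 104, month = 24, quarter = 8, year = 2)

# Hypothetical helper (not a finnts function): pick the model name for a
# given forecast horizon and date type.
choose_timegpt_model <- function(forecast_horizon, date_type) {
  if (!date_type %in% names(long_horizon_threshold)) {
    stop("Unknown date_type: ", date_type)
  }
  if (forecast_horizon > long_horizon_threshold[[date_type]]) {
    "timegpt-1-long-horizon"
  } else {
    "timegpt-1"
  }
}

choose_timegpt_model(6, "month")  # horizon within the 24-month threshold
choose_timegpt_model(36, "month") # horizon beyond the 24-month threshold
```

With the monthly example used throughout this vignette (`forecast_horizon = 6`), the horizon is well below the 24-month threshold, so the long-horizon model would not be selected.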