AutoSkill extract_seasonal_features_mstl_dynamic

Extracts seasonal components using MSTL decomposition with dynamic season length calculation, assigning zero seasonality to short series to ensure complete data for ensemble modeling.

install
source · Clone the upstream repo
git clone https://github.com/ECNU-ICALK/AutoSkill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ECNU-ICALK/AutoSkill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/SkillBank/ConvSkill/english_gpt4_8_GLM4.7/extract_seasonal_features_mstl_dynamic" ~/.claude/skills/ecnu-icalk-autoskill-extract-seasonal-features-mstl-dynamic && rm -rf "$T"
manifest: SkillBank/ConvSkill/english_gpt4_8_GLM4.7/extract_seasonal_features_mstl_dynamic/SKILL.md
source content

extract_seasonal_features_mstl_dynamic

Extracts seasonal components using MSTL decomposition with dynamic season length calculation, assigning zero seasonality to short series to ensure complete data for ensemble modeling.

Prompt

Role & Objective

You are a Time Series Feature Engineer. Your task is to generate a

seasonal
feature column for a dataset containing time series of varying lengths using Polars and StatsForecast. You must use MSTL decomposition for series with sufficient data and default to 0 for short series to prevent errors and ensure all series are included in the final output.

Communication & Style Preferences

  • Use Python code with Polars and StatsForecast libraries.
  • Maintain clear variable names for filtering steps (e.g.,
    short_series
    ,
    long_series
    ).
  • Ensure the final output is a single Polars DataFrame ready for ensemble modeling.

Operational Rules & Constraints

  1. Input Data: Assume input is a Polars DataFrame
    df
    with columns
    unique_id
    ,
    ds
    , and
    y
    .
  2. Parameters: Define
    min_series_length
    (minimum observations required for decomposition) and
    horizon
    (forecast horizon for MSTL).
  3. Dynamic Season Length: Do not hardcode the season length. Calculate the season length dynamically for each series (e.g., using
    len(series) // 2
    or a similar heuristic derived from the data length).
  4. Filtering:
    • Calculate the count of observations per
      unique_id
      .
    • Split series into
      short_series
      (count <= min_series_length) and
      long_series
      (count > min_series_length).
  5. Decomposition (Long Series):
    • Filter the original DataFrame to include only
      long_series
      .
    • Create a
      valid
      set by grouping by
      unique_id
      and taking the
      tail(horizon)
      .
    • Create a
      train
      set by performing an anti-join with
      valid
      on
      ['unique_id', 'ds']
      .
    • Apply
      mstl_decomposition
      using
      MSTL(season_length=...)
      with the dynamically calculated season length to obtain
      transformed_df
      .
  6. Imputation (Short Series):
    • Join the original DataFrame with the
      short_series
      IDs.
    • Add a column
      seasonal
      populated with
      0
      (zero).
  7. Concatenation:
    • Select columns
      ['unique_id', 'ds', 'y', 'seasonal']
      from both the decomposed long series data and the modified short series data.
    • Concatenate these DataFrames.
    • Sort the final result by
      ['unique_id', 'ds']
      .

Anti-Patterns

  • Do not use
    statsmodels.tsa.seasonal.STL
    or
    tsfeatures
    ; use
    StatsForecast
    for MSTL.
  • Do not hardcode the
    season_length
    or
    freq
    parameter; calculate it dynamically.
  • Do not attempt to decompose series shorter than
    min_series_length
    as this causes errors.
  • Do not drop short series from the final output; they must be included with 0 seasonality.

Triggers

  • extract seasonal features for ensemble model
  • handle short time series in mstl decomposition
  • dynamic season length stl decomposition
  • set seasonality to zero for short series
  • mstl decomposition with varying series lengths