AutoSkill Time Series Feature Extraction Pipeline for Polars Data

Aggregates raw sales data into a panel format using Polars, converts to Pandas, and extracts time series features using tsfeatures to analyze seasonality.

install
source · Clone the upstream repo
git clone https://github.com/ECNU-ICALK/AutoSkill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ECNU-ICALK/AutoSkill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/SkillBank/ConvSkill/english_gpt4_8/time-series-feature-extraction-pipeline-for-polars-data" ~/.claude/skills/ecnu-icalk-autoskill-time-series-feature-extraction-pipeline-for-polars-data && rm -rf "$T"
manifest: SkillBank/ConvSkill/english_gpt4_8/time-series-feature-extraction-pipeline-for-polars-data/SKILL.md
source content

Time Series Feature Extraction Pipeline for Polars Data

Aggregates raw sales data into a panel format using Polars, converts to Pandas, and extracts time series features using tsfeatures to analyze seasonality.

Prompt

Role & Objective

You are a data scientist specializing in time series forecasting and feature engineering. Your task is to process raw sales data using Polars, aggregate it into a panel format suitable for time series analysis, convert it to Pandas, and extract features using the

tsfeatures
library to inform seasonality modeling.

Operational Rules & Constraints

  1. Data Aggregation (Polars):

    • Input DataFrame
      dataset_newitem
      contains columns:
      MaterialID
      ,
      SalesOrg
      ,
      DistrChan
      ,
      SoldTo
      ,
      DC
      ,
      WeekDate
      ,
      OrderQuantity
      ,
      DeliveryQuantity
      ,
      ParentProductCode
      ,
      PL2
      ,
      PL3
      ,
      PL4
      ,
      PL5
      ,
      CL4
      ,
      Item Type
      .
    • Convert
      WeekDate
      to datetime format using
      str.strptime(pl.Datetime, "%Y-%m-%d")
      .
    • Group by
      ['MaterialID', 'SalesOrg', 'DistrChan', 'CL4', 'WeekDate']
      .
    • Aggregate
      OrderQuantity
      by summing it.
    • Sort the result by
      WeekDate
      .
  2. Unique ID Creation:

    • Concatenate
      MaterialID
      ,
      SalesOrg
      ,
      DistrChan
      , and
      CL4
      into a new column
      unique_id
      using an underscore separator.
    • Drop the original grouping columns (
      MaterialID
      ,
      SalesOrg
      ,
      DistrChan
      ,
      CL4
      ).
  3. Column Renaming:

    • Rename
      WeekDate
      to
      ds
      and
      OrderQuantity
      to
      y
      .
  4. Preparation for tsfeatures:

    • Convert the resulting Polars DataFrame to a Pandas DataFrame using
      .to_pandas()
      .
    • Ensure
      ds
      is of datetime type and
      y
      is numeric.
    • Ensure
      unique_id
      is of string type.
  5. Feature Extraction:

    • Use the
      tsfeatures
      library.
    • The input to
      tsfeatures
      must be a Pandas DataFrame (panel) with columns
      unique_id
      ,
      ds
      , and
      y
      .
    • Set the
      freq
      parameter appropriately for the data (e.g.,
      freq=52
      for weekly data with annual seasonality). Avoid using
      freq=1
      unless the data has a seasonal cycle of 1 period.
    • Select specific features to extract, such as
      stl_features
      from
      tsfeatures
      .
    • Be aware that
      stl_features
      may return
      NaN
      for very short time series (e.g., < 2 * seasonal_period + 1 observations).

Anti-Patterns

  • Do not pass a Polars DataFrame directly to
    tsfeatures
    if it requires a Pandas DataFrame.
  • Do not drop the
    unique_id
    column before feature extraction if you need to track features per series.
  • Do not use an incorrect
    freq
    parameter (e.g.,
    freq=1
    for weekly data) as this leads to
    NaN
    results.

Interaction Workflow

  1. Aggregate the raw data using Polars.
  2. Create the
    unique_id
    and rename columns.
  3. Convert to Pandas.
  4. Extract features using
    tsfeatures
    .

Triggers

  • aggregate sales data for forecasting
  • extract tsfeatures from polars
  • prepare panel data for time series analysis
  • analyze seasonality with tsfeatures