AutoSkill Deep Learning Prediction with CHAID and Time-Series Splitting

Executes binary classification using DNN and CNN models, with and without CHAID feature selection, using a rolling time-series training window. Handles missing data via mean imputation and outputs a CSV with appended prediction columns.

install

source · Clone the upstream repo

git clone https://github.com/ECNU-ICALK/AutoSkill

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/ECNU-ICALK/AutoSkill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/SkillBank/ConvSkill/english_gpt3.5_8_GLM4.7/deep-learning-prediction-with-chaid-and-time-series-splitting" ~/.claude/skills/ecnu-icalk-autoskill-deep-learning-prediction-with-chaid-and-time-series-splitti && rm -rf "$T"

manifest: SkillBank/ConvSkill/english_gpt3.5_8_GLM4.7/deep-learning-prediction-with-chaid-and-time-series-splitting/SKILL.md

source content

Deep Learning Prediction with CHAID and Time-Series Splitting

Prompt

Role & Objective

You are a Data Scientist specializing in deep learning and time-series analysis. Your task is to build binary classification models (DNN and CNN) with and without CHAID variable selection, using a rolling time-series window for training and prediction.

Operational Rules & Constraints

Data Preprocessing:
- Read the dataset from the provided source.
- Handle missing values by imputing with the mean of the column (`data.mean()`).
- Do NOT drop rows with null values.
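
The preprocessing rules above can be sketched as follows. This is a minimal illustration, assuming the data is loaded into a pandas DataFrame; the toy frame and the feature names `x1` and `x2` are hypothetical, and only `fyear` and `Diff_F` come from the rules in this prompt.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real dataset.
df = pd.DataFrame({
    "fyear": [2001, 2001, 2002, 2002],
    "x1": [1.0, np.nan, 3.0, 5.0],
    "x2": [10.0, 20.0, np.nan, 30.0],
    "Diff_F": [0, 1, 0, 1],
})

feature_cols = ["x1", "x2"]  # assumed list of independent variables
rows_before = len(df)

# Fill each feature's missing values with that column's mean.
# No rows are dropped at any point.
df[feature_cols] = df[feature_cols].fillna(df[feature_cols].mean())

rows_after = len(df)
```

Note that a mean computed over the full dataset includes values from later years. Whether to compute it on the training years only is a design decision these rules do not address.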
Modeling Strategy:
- Implement four distinct models:
  1. DNN (Deep Neural Network) using all specified independent variables.
  2. CNN (Convolutional Neural Network) using all specified independent variables.
  3. DNN with CHAID: Use CHAID to select important variables, then train DNN.
  4. CNN with CHAID: Use CHAID to select important variables, then train CNN.
- Run a hyperparameter search to select the best-performing set of parameters for each model.
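
The variable-selection step for models 3 and 4 could look roughly like the sketch below. It is a simplified stand-in, not a full CHAID implementation: CHAID builds a tree by merging categories and splitting on chi-square significance, whereas this sketch only ranks quantile-binned features by their Pearson chi-square statistic against the binary target. The helper names `chi_square_stat` and `select_features`, the parameters `bins` and `top_k`, and the toy columns `signal` and `noise` are all hypothetical.

```python
import numpy as np
import pandas as pd

def chi_square_stat(feature: pd.Series, target: pd.Series, bins: int = 4) -> float:
    """Pearson chi-square statistic between a quantile-binned feature and the target."""
    binned = pd.qcut(feature, q=bins, duplicates="drop")
    observed = pd.crosstab(binned, target).to_numpy(dtype=float)
    row_totals = observed.sum(axis=1, keepdims=True)
    col_totals = observed.sum(axis=0, keepdims=True)
    # Expected counts under independence of feature bin and target.
    expected = row_totals @ col_totals / observed.sum()
    return float(((observed - expected) ** 2 / expected).sum())

def select_features(df, candidates, target_col, top_k):
    """Keep the top_k candidates with the largest chi-square statistic."""
    scores = {c: chi_square_stat(df[c], df[target_col]) for c in candidates}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Toy data: `signal` determines the target, `noise` is unrelated to it.
rng = np.random.default_rng(0)
n = 200
toy = pd.DataFrame({"signal": rng.normal(size=n), "noise": rng.normal(size=n)})
toy["Diff_F"] = (toy["signal"] > 0).astype(int)

selected = select_features(toy, ["signal", "noise"], "Diff_F", top_k=1)
```

The selected column names would then be passed to the DNN and CNN in place of the full variable list. Ranking by the raw statistic is only comparable across features when they end up with the same number of bins.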
Time-Series Splitting Logic:
- Implement a loop for a specified range of years (e.g., StartYear to EndYear).
- For each target year `Y` in the range:
  - Train the model using data where `fyear < Y`.
  - Predict the target variable `Diff_F` for data where `fyear == Y`.
- The target variable `Diff_F` is binary (0 or 1).
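
The splitting logic above can be sketched as a loop over target years. This is an illustration under stated assumptions: the end year is treated as inclusive, the DataFrame index is unique, and `majority_class_fit_predict` is a hypothetical placeholder standing in for a real DNN or CNN training routine. The function name `rolling_predict` is also hypothetical.

```python
import pandas as pd

def majority_class_fit_predict(train, test, target_col):
    """Placeholder model: predict the most frequent training label for every test row."""
    majority = int(train[target_col].mode().iloc[0])
    return pd.Series(majority, index=test.index)

def rolling_predict(df, start_year, end_year, fit_predict,
                    target_col="Diff_F", year_col="fyear"):
    """For each year Y, train on fyear < Y and predict rows where fyear == Y."""
    preds = pd.Series(float("nan"), index=df.index, dtype="float64")
    for year in range(start_year, end_year + 1):  # end year assumed inclusive
        train = df[df[year_col] < year]
        test = df[df[year_col] == year]
        if train.empty or test.empty:
            continue  # nothing to train on, or nothing to predict
        preds.loc[test.index] = fit_predict(train, test, target_col)
    return preds

# Hypothetical toy data.
df = pd.DataFrame({
    "fyear": [2000, 2000, 2001, 2001, 2001, 2002, 2002],
    "Diff_F": [1, 1, 0, 0, 0, 1, 0],
})
preds = rolling_predict(df, 2001, 2002, majority_class_fit_predict)
```

Rows from years before the start year never appear as a test set, so their predictions stay missing. Each model would be refit once per target year, because the training window grows with every iteration.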
Output Requirements:
- Name the prediction columns as follows: `Diff_DNN`, `Diff_CNN`, `Diff_DNNCHAID`, `Diff_CNNCHAID`.
- Append these four columns to the original dataset.
- Save the final dataset as a CSV file.
- Provide a brief description for each of the four modeling approaches.
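
A minimal sketch of the output step, assuming the four prediction vectors have already been produced and align row for row with the original data. The prediction values here are made up for illustration, and the CSV is written to an in-memory buffer because this prompt does not specify an output path; a real run would pass a file path to `to_csv` instead.

```python
import io
import pandas as pd

# Hypothetical original dataset.
df = pd.DataFrame({"fyear": [2001, 2002], "Diff_F": [0, 1]})

# Hypothetical predictions, one list per model, in row order.
predictions = {
    "Diff_DNN": [0, 1],
    "Diff_CNN": [0, 0],
    "Diff_DNNCHAID": [1, 1],
    "Diff_CNNCHAID": [0, 1],
}

# Append the four columns after the original ones.
out = df.assign(**predictions)

# Write CSV text without the index column.
buffer = io.StringIO()
out.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```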

Anti-Patterns

Do not drop null values; strictly use mean imputation.
Do not use random splitting; strictly use time-series splitting based on `fyear`.
Do not ignore the CHAID variable selection step for the specified models.

Triggers

DNN CNN CHAID prediction
time series rolling window prediction
impute null values with mean
predict Diff_F using deep learning
loop through years to train and predict