Awesome-Agent-Skills-for-Empirical-Research keras-deep-learning

Build and debug deep learning models with Keras and TensorFlow backend

install
source · Clone the upstream repo
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/43-wentorai-research-plugins/skills/domains/ai-ml/keras-deep-learning" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-keras-deep-learni && rm -rf "$T"
manifest: skills/43-wentorai-research-plugins/skills/domains/ai-ml/keras-deep-learning/SKILL.md
source content

Keras Deep Learning Guide

Overview

Keras is the high-level deep learning API that ships as part of TensorFlow 2.x and is the recommended interface for building, training, and deploying neural networks. Its Sequential and Functional APIs provide a progressive disclosure of complexity: beginners can stack layers in minutes, while researchers can build arbitrary DAG architectures, custom training loops, and multi-output models with the same framework.

This guide covers practical patterns for academic research with Keras, from image classification and sequence modeling to custom loss functions and experiment reproducibility. The focus is on patterns that appear repeatedly in published work -- data loading pipelines, callback orchestration, hyperparameter search, and model introspection -- rather than toy examples.

Keras is particularly strong in rapid prototyping for research papers. Its integration with TensorBoard, Weights & Biases, and tf.data pipelines makes it straightforward to go from idea to reproducible experiment to publication-quality results.

Model Architecture Patterns

Sequential API for Standard Architectures

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Image classification baseline
model = keras.Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Rescaling(1.0 / 255),
    layers.Conv2D(32, 3, activation="relu", padding="same"),
    layers.BatchNormalization(),
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 3, activation="relu", padding="same"),
    layers.BatchNormalization(),
    layers.MaxPooling2D(2),
    layers.Conv2D(128, 3, activation="relu", padding="same"),
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),
    layers.Dense(256, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

model.compile(
    optimizer=keras.optimizers.AdamW(learning_rate=1e-3, weight_decay=1e-4),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

Functional API for Multi-Input/Multi-Output Models

# Multi-input model for multimodal research
image_input = keras.Input(shape=(224, 224, 3), name="image")
text_input = keras.Input(shape=(128,), dtype="int32", name="text")

# Image branch
x_img = keras.applications.EfficientNetV2B0(
    include_top=False, weights="imagenet", input_tensor=image_input
).output
x_img = layers.GlobalAveragePooling2D()(x_img)

# Text branch
x_txt = layers.Embedding(10000, 128)(text_input)
x_txt = layers.Bidirectional(layers.LSTM(64))(x_txt)

# Merge
merged = layers.Concatenate()([x_img, x_txt])
merged = layers.Dense(256, activation="relu")(merged)
merged = layers.Dropout(0.4)(merged)
output = layers.Dense(5, activation="softmax", name="classification")(merged)

model = keras.Model(inputs=[image_input, text_input], outputs=output)

Data Pipeline with tf.data

Efficient data loading is critical for GPU utilization in research experiments:

def build_dataset(file_pattern, batch_size=32, training=True):
    """Build a tf.data pipeline with augmentation for research experiments."""
    dataset = tf.data.Dataset.list_files(file_pattern, shuffle=training)

    def parse_image(path):
        img = tf.io.read_file(path)
        img = tf.image.decode_jpeg(img, channels=3)
        img = tf.image.resize(img, [256, 256])
        label = tf.strings.split(path, os.sep)[-2]
        return img, label

    dataset = dataset.map(parse_image, num_parallel_calls=tf.data.AUTOTUNE)

    if training:
        dataset = dataset.shuffle(1000)
        dataset = dataset.map(
            lambda x, y: (tf.image.random_flip_left_right(x), y),
            num_parallel_calls=tf.data.AUTOTUNE,
        )

    dataset = dataset.batch(batch_size)
    dataset = dataset.prefetch(tf.data.AUTOTUNE)
    return dataset

Training and Callback Orchestration

Reproducible Training Setup

import os
import random
import numpy as np

def set_seed(seed=42):
    """Ensure reproducibility across runs for paper results."""
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    np.random.seed(seed)
    tf.random.set_seed(seed)

set_seed(42)

callbacks = [
    keras.callbacks.ModelCheckpoint(
        "best_model.keras", monitor="val_loss", save_best_only=True
    ),
    keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=10, restore_best_weights=True
    ),
    keras.callbacks.ReduceLROnPlateau(
        monitor="val_loss", factor=0.5, patience=5, min_lr=1e-6
    ),
    keras.callbacks.TensorBoard(log_dir="./logs", histogram_freq=1),
    keras.callbacks.CSVLogger("training_log.csv"),
]

history = model.fit(
    train_dataset,
    validation_data=val_dataset,
    epochs=100,
    callbacks=callbacks,
)

Custom Training Loop for Research

@tf.function
def train_step(model, optimizer, x, y, loss_fn):
    with tf.GradientTape() as tape:
        predictions = model(x, training=True)
        loss = loss_fn(y, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return loss

# Custom metric tracking
train_loss = keras.metrics.Mean(name="train_loss")
for epoch in range(num_epochs):
    train_loss.reset_state()
    for x_batch, y_batch in train_dataset:
        loss = train_step(model, optimizer, x_batch, y_batch, loss_fn)
        train_loss.update_state(loss)
    print(f"Epoch {epoch+1}, Loss: {train_loss.result():.4f}")

Debugging and Common Pitfalls

IssueSymptomSolution
Exploding gradientsLoss becomes NaNAdd gradient clipping, reduce learning rate
OverfittingVal loss diverges from train lossAdd Dropout, data augmentation, weight decay
UnderfittingBoth losses plateau highIncrease model capacity, reduce regularization
Slow trainingLow GPU utilizationUse tf.data with prefetch, increase batch size
Memory errorsOOM on GPUReduce batch size, use mixed precision
Non-deterministic resultsDifferent results per runCall
set_seed()
, set
TF_DETERMINISTIC_OPS=1

Mixed Precision Training

# Enable mixed precision for 2x speedup on modern GPUs
keras.mixed_precision.set_global_policy("mixed_float16")

# Ensure the output layer uses float32 for numerical stability
output = layers.Dense(10, activation="softmax", dtype="float32")(x)

Best Practices for Research

  • Version pin everything. Record
    tensorflow
    ,
    keras
    ,
    numpy
    , and
    cuda
    versions in your paper appendix.
  • Use
    keras.utils.set_random_seed(42)
    for full determinism (TF 2.12+).
  • Save models in
    .keras
    format
    (not HDF5) for forward compatibility.
  • Profile with TensorBoard to identify data pipeline bottlenecks before scaling up.
  • Use
    tf.debugging.enable_check_numerics()
    during development to catch NaN/Inf early.
  • Export with
    tf.saved_model
    for deployment; export ONNX for cross-framework comparison.

References