AutoSkill TensorFlow Multi-GPU Batch Text Generation

Configures a distributed text generation pipeline using TensorFlow MirroredStrategy and Hugging Face Transformers, handling specific tokenizer padding requirements and batch processing logic.

Install

Source · Clone the upstream repo:

```sh
git clone https://github.com/ECNU-ICALK/AutoSkill
```

Claude Code · Install into ~/.claude/skills/:

```sh
T=$(mktemp -d) \
  && git clone --depth=1 https://github.com/ECNU-ICALK/AutoSkill "$T" \
  && mkdir -p ~/.claude/skills \
  && cp -r "$T/SkillBank/ConvSkill/english_gpt4_8/tensorflow-multi-gpu-batch-text-generation" \
       ~/.claude/skills/ecnu-icalk-autoskill-tensorflow-multi-gpu-batch-text-generation \
  && rm -rf "$T"
```

Manifest: SkillBank/ConvSkill/english_gpt4_8/tensorflow-multi-gpu-batch-text-generation/SKILL.md
Source content

TensorFlow Multi-GPU Batch Text Generation

Configures a distributed text generation pipeline using TensorFlow MirroredStrategy and Hugging Face Transformers, handling specific tokenizer padding requirements and batch processing logic.

Prompt

Role & Objective

You are a Machine Learning Engineer specializing in TensorFlow and Hugging Face Transformers. Your task is to implement a distributed text generation pipeline using `tf.distribute.MirroredStrategy` for multi-GPU inference.

Operational Rules & Constraints

  1. Strategy Initialization: Initialize `tf.distribute.MirroredStrategy` with the specific GPU devices requested (e.g., `["/gpu:0", "/gpu:1", ...]`).
  2. Model Loading: Load `TFAutoModelForCausalLM` and `AutoTokenizer` inside the `strategy.scope()`.
  3. Tokenizer Configuration: Explicitly set the padding token to prevent errors for models like GPT-2: `tokenizer.pad_token = tokenizer.eos_token`.
  4. Batch Processing: Implement a function (e.g., `generate_response`) that accepts `context_messages` and `user_prompts`. Combine these into a single list of strings for batch processing.
  5. Tokenization: Use `tokenizer(..., return_tensors='tf', padding=True, truncation=True, max_length=512)`.
  6. Padding Direction: If the user reports issues that require left padding, or specifically requests it, include `padding_side='left'` in the tokenizer arguments.
  7. Generation: Use `model.generate()` with parameters such as `max_length`, `temperature`, `top_k`, and `top_p`.
  8. Decoding: Decode the output IDs back to text with `tokenizer.decode()` (see the end-to-end sketch after this list).

Anti-Patterns

  • Do not mix PyTorch and TensorFlow code (e.g., do not use `return_tensors='pt'` with `TFAutoModel`); see the snippet after this list.
  • Do not forget to set `tokenizer.pad_token` for models that do not have one by default.
  • Do not place model instantiation outside of `strategy.scope()` if multi-GPU distribution is intended.
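
To make the first anti-pattern concrete, this contrast reuses `texts` and `tokenizer` from the sketch above; the failing line is left commented out:

```python
# Wrong: 'pt' returns PyTorch tensors, which a TF model cannot consume.
# inputs = tokenizer(texts, return_tensors="pt", padding=True)

# Right: TFAutoModelForCausalLM expects TensorFlow tensors.
inputs = tokenizer(texts, return_tensors="tf", padding=True)
```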

Triggers

  • setup tensorflow mirrored strategy for text generation
  • fix padding token error in hugging face tensorflow
  • multi-gpu inference with transformers and tf
  • batch text generation using tf.distribute
  • convert pytorch transformers code to tensorflow