Babysitter Lexer Generator
Expert skill for generating and hand-writing lexers using DFA-based, table-driven, and recursive approaches
install
source · Clone the upstream repo
git clone https://github.com/a5c-ai/babysitter
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/a5c-ai/babysitter "$T" && mkdir -p ~/.claude/skills && cp -r "$T/library/specializations/programming-languages/skills/lexer-generator" ~/.claude/skills/a5c-ai-babysitter-lexer-generator && rm -rf "$T"
manifest:
library/specializations/programming-languages/skills/lexer-generator/SKILL.mdsource content
Lexer Generator Skill
Overview
Expert skill for generating and hand-writing lexers using various approaches including DFA-based lexers, table-driven lexers, and hand-written recursive lexers.
Capabilities
- Generate lexer from regular expression specifications
- Implement maximal munch tokenization
- Handle Unicode character classes and normalization
- Implement efficient keyword recognition (tries, perfect hashing)
- Support incremental/resumable lexing for IDE integration
- Generate lexer tables and state machines
- Handle lexer modes and contexts (e.g., string interpolation)
- Implement error recovery with skip-to-next strategies
Target Processes
- lexer-implementation.js
- language-grammar-design.js
- lsp-server-implementation.js
- repl-development.js
Dependencies
- Flex-like generators
- RE2/Hyperscan libraries
Usage Guidelines
- Token Definition: Start by defining the complete set of tokens with their regex patterns
- Maximal Munch: Always implement maximal munch to handle ambiguous token boundaries
- Unicode Support: Consider Unicode normalization forms and character classes from the start
- Error Recovery: Implement skip-to-next-valid strategies for robust error handling
- Performance: Use table-driven approaches for large token sets, hand-written for simple lexers
Output Schema
{ "type": "object", "properties": { "tokens": { "type": "array", "items": { "type": "object", "properties": { "name": { "type": "string" }, "pattern": { "type": "string" }, "priority": { "type": "integer" } } } }, "lexerType": { "type": "string", "enum": ["dfa", "table-driven", "hand-written"] }, "generatedFiles": { "type": "array", "items": { "type": "string" } } } }