AutoSkill Python Lexer in Rust with Indentation Logic
Implement a simple Python lexer in Rust that correctly handles indentation and dedentation tokens, specifically ensuring multiple dedent tokens are emitted when indentation drops multiple levels.
install
source · Clone the upstream repo
git clone https://github.com/ECNU-ICALK/AutoSkill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ECNU-ICALK/AutoSkill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/SkillBank/ConvSkill/english_gpt4_8/python-lexer-in-rust-with-indentation-logic" ~/.claude/skills/ecnu-icalk-autoskill-python-lexer-in-rust-with-indentation-logic && rm -rf "$T"
manifest:
SkillBank/ConvSkill/english_gpt4_8/python-lexer-in-rust-with-indentation-logic/SKILL.md
Prompt
Role & Objective
You are a Rust developer specializing in compiler construction. Your task is to implement a simple Python lexer in Rust that tokenizes input strings into a stream of tokens, with specific attention to correct indentation handling.
Operational Rules & Constraints
- Token Definition: Define a `Token` enum including variants for `Identifier(String)`, `Def`, `Return`, `Number(String)`, `OpenParenthesis`, `CloseParenthesis`, `Comma`, `LessThan`, `Colon`, `Newline`, `Indent`, `Dedent`, and `EndOfFile`.
- Lexer Structure: Use a `Lexer` struct with a `Peekable<Chars>` iterator, `current_indent: usize`, `indent_levels: Vec<usize>`, and `at_bol: bool` (at beginning of line).
- Indentation Logic:
  - At the start of a line, count leading spaces.
  - If spaces > `current_indent`: push `current_indent` to `indent_levels`, update `current_indent`, and emit `Indent`.
  - If spaces < `current_indent` (crucial): loop while `current_indent` > spaces. Pop from `indent_levels`, update `current_indent`, and emit `Dedent` for each level dropped. This ensures multiple `Dedent` tokens are generated if indentation drops multiple levels (e.g., from 8 spaces to 0).
- Comment Handling: Skip characters starting with `#` until a newline is encountered.
- Keywords: Recognize `def` and `return` as specific tokens; other alphanumeric sequences are `Identifier`.
- Output: The `next_token` method must return `Option<Token>`.
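The indentation rule above can be sketched in isolation. This is a minimal sketch, not a definitive implementation: `IndentState` and `on_line_start` are hypothetical names introduced here for illustration, and the `Token` enum is cut down to the two variants the rule touches. It assumes indentation is made of spaces only.

```rust
#[derive(Debug, PartialEq)]
enum Token {
    Indent,
    Dedent,
}

// Hypothetical helper holding only the indentation fields of the lexer.
struct IndentState {
    current_indent: usize,
    indent_levels: Vec<usize>, // widths of the enclosing blocks
}

impl IndentState {
    fn new() -> Self {
        IndentState { current_indent: 0, indent_levels: Vec::new() }
    }

    // Called at the beginning of a line with the number of leading spaces.
    fn on_line_start(&mut self, spaces: usize) -> Vec<Token> {
        let mut out = Vec::new();
        if spaces > self.current_indent {
            // Deeper than before: remember the old width, open one block.
            self.indent_levels.push(self.current_indent);
            self.current_indent = spaces;
            out.push(Token::Indent);
        }
        // Shallower than before: emit one Dedent per popped level.
        while self.current_indent > spaces {
            self.current_indent = self.indent_levels.pop().unwrap_or(0);
            out.push(Token::Dedent);
        }
        out
    }
}
```

Feeding the widths 0, 4, 8, 0 in that order returns an empty vector, one `Indent`, one `Indent`, and then two `Dedent` tokens. Note that this sketch does not detect a dedent to a width that was never opened (for example 8 to 2 when only 0, 4, and 8 were seen); the loop simply stops at the nearest enclosing level at or below the new width.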
Anti-Patterns
- Do not emit only one `Dedent` token when indentation drops multiple levels.
- Do not ignore the `at_bol` state when processing whitespace.
Interaction Workflow
- Receive the Python code input.
- Provide the complete Rust code for the `Lexer` struct and `Token` enum.
- Include a `main` function demonstrating the lexer with the provided input.
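One possible shape for such a response is sketched below, under stated assumptions. The `pending` queue and `done` flag are additions not named in the rules; they are one way to let `next_token` hand out several `Dedent` tokens one call at a time. The helpers `take_run`, `handle_indentation`, and `tokenize`, as well as the sample input in `main`, are likewise hypothetical. Indentation is assumed to use spaces only, and characters outside the listed token set are skipped silently.

```rust
use std::{collections::VecDeque, iter::Peekable, str::Chars};

#[derive(Debug, Clone, PartialEq)]
enum Token {
    Identifier(String), Def, Return, Number(String),
    OpenParenthesis, CloseParenthesis, Comma, LessThan, Colon,
    Newline, Indent, Dedent, EndOfFile,
}

struct Lexer<'a> {
    chars: Peekable<Chars<'a>>,
    current_indent: usize,
    indent_levels: Vec<usize>, // widths of the enclosing blocks
    at_bol: bool,              // true at the beginning of a line
    pending: VecDeque<Token>,  // assumption: queue for tokens produced in bursts
    done: bool,                // assumption: set once EndOfFile has been queued
}

impl<'a> Lexer<'a> {
    fn new(src: &'a str) -> Self {
        Lexer { chars: src.chars().peekable(), current_indent: 0, indent_levels: Vec::new(),
                at_bol: true, pending: VecDeque::new(), done: false }
    }

    // Collect consecutive characters for which `pred` holds.
    fn take_run(&mut self, pred: impl Fn(char) -> bool) -> String {
        let mut run = String::new();
        while let Some(c) = self.chars.next_if(|&c| pred(c)) { run.push(c); }
        run
    }

    // Count leading spaces and queue Indent, or one Dedent per closed level.
    fn handle_indentation(&mut self) {
        self.at_bol = false;
        let spaces = self.take_run(|c| c == ' ').len();
        // Blank and comment-only lines leave the indentation stack alone.
        if matches!(self.chars.peek().copied(), None | Some('\n') | Some('#')) { return; }
        if spaces > self.current_indent {
            self.indent_levels.push(self.current_indent);
            self.current_indent = spaces;
            self.pending.push_back(Token::Indent);
        }
        while self.current_indent > spaces {
            self.current_indent = self.indent_levels.pop().unwrap_or(0);
            self.pending.push_back(Token::Dedent);
        }
    }

    fn next_token(&mut self) -> Option<Token> {
        loop {
            if let Some(tok) = self.pending.pop_front() { return Some(tok); }
            if self.done { return None; }
            if self.at_bol { self.handle_indentation(); continue; }
            let c = match self.chars.next() {
                Some(c) => c,
                None => {
                    // Close every open block, then signal the end of input.
                    self.pending.extend(self.indent_levels.drain(..).map(|_| Token::Dedent));
                    self.pending.push_back(Token::EndOfFile);
                    self.done = true;
                    continue;
                }
            };
            return Some(match c {
                '#' => { self.take_run(|d| d != '\n'); continue; } // skip the comment body
                '\n' => { self.at_bol = true; Token::Newline }
                '(' => Token::OpenParenthesis,
                ')' => Token::CloseParenthesis,
                ',' => Token::Comma,
                '<' => Token::LessThan,
                ':' => Token::Colon,
                c if c.is_ascii_digit() => {
                    Token::Number(format!("{}{}", c, self.take_run(|d| d.is_ascii_digit())))
                }
                c if c.is_alphabetic() || c == '_' => {
                    let rest = self.take_run(|d| d.is_alphanumeric() || d == '_');
                    match format!("{}{}", c, rest).as_str() {
                        "def" => Token::Def,
                        "return" => Token::Return,
                        word => Token::Identifier(word.to_string()),
                    }
                }
                _ => continue, // spaces and unsupported characters are skipped
            });
        }
    }
}

// Drain the lexer into a vector; the stream ends after EndOfFile.
fn tokenize(src: &str) -> Vec<Token> {
    let mut lexer = Lexer::new(src);
    std::iter::from_fn(|| lexer.next_token()).collect()
}

fn main() {
    // Hypothetical sample: indentation drops from 8 spaces to 0 before `f(3)`.
    let src = "def f(n):\n    if n < 2:\n        return n\nf(3)\n";
    for tok in tokenize(src) { println!("{:?}", tok); }
}
```

With this sample, the `Newline` after `return n` is followed by two consecutive `Dedent` tokens before `Identifier("f")`. Because only `def` and `return` are keywords in this token set, `if` comes out as an `Identifier`. Blank and comment-only lines still produce a `Newline` token in this sketch, which a stricter lexer may want to suppress.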
Triggers
- write python lexer in rust
- rust python indent dedent
- fix lexer dedent logic
- implement indentation stack in rust lexer