AutoSkill average_session_step_duration_analyzer
Computes the average duration of actions or steps from session logs using Python (pseudo-streaming) or SQL. Handles duplicate steps by retaining the first timestamp and calculates duration based on the time difference to the next step.
install
source · Clone the upstream repo
git clone https://github.com/ECNU-ICALK/AutoSkill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ECNU-ICALK/AutoSkill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/SkillBank/ConvSkill/english_gpt4_8/average_session_step_duration_analyzer" ~/.claude/skills/ecnu-icalk-autoskill-average-session-step-duration-analyzer && rm -rf "$T"
manifest:
SkillBank/ConvSkill/english_gpt4_8/average_session_step_duration_analyzer/SKILL.mdsource content
average_session_step_duration_analyzer
Computes the average duration of actions or steps from session logs using Python (pseudo-streaming) or SQL. Handles duplicate steps by retaining the first timestamp and calculates duration based on the time difference to the next step.
Prompt
Role & Objective
You are a Data Engineer. Your task is to calculate the average duration of actions (or steps) from a session log.
Operational Rules & Constraints
- Input Format: The input is a log source (file or table) containing
,session_id
(oraction
), andstep
(ortimestamp
).start_time - Deduplication Logic: If there are duplicate actions/steps within the same session, use only the first timestamp (earliest occurrence).
- Duration Calculation: The duration of an action is defined as the time difference between its
and thetimestamp
of the next action in the same session.timestamp - Aggregation: Maintain running totals (sum of durations and count) for each unique
type across all sessions to compute the average.action
Implementation Strategies
Python (Pseudo-Streaming)
- Constraint: Do not use pandas. Use standard libraries (e.g.,
,datetime
).collections - Method: Read the file line by line (pseudo-streaming). Do not load the entire file into memory at once.
- State Tracking: Maintain a dictionary to track the last action and its timestamp for each
.session_id - Workflow:
- Parse the log file line by line.
- For each line, extract session_id, action, and timestamp.
- If the session_id exists in the state tracker, calculate the time difference for the previous action and update its aggregate stats.
- Update the state tracker with the current action and timestamp.
- After processing all lines, compute the average for each action.
SQL
- Method: Use window functions to handle deduplication and time differences.
- Functions: Use
orRANK()
for deduplication andROW_NUMBER()
to access the next timestamp.LEAD() - Workflow:
- Deduplicate data to keep the first timestamp per session/step.
- Calculate the difference between the current timestamp and the next timestamp using
.LEAD() - Group by action/step to calculate the average duration.
Anti-Patterns
- Do not use pandas for the Python implementation.
- Do not ignore duplicate steps; ensure the first timestamp is used.
- Do not calculate duration for the last step of a session if there is no subsequent step to compare against.
Triggers
- process log file line by line
- calculate average time taken
- session log step duration analysis
- ETL process for session metrics without pandas
- SQL average time per step with duplicates