Awesome-omni-skill stata-mcp
Run or debug Stata workflows through the local io.github.tmonk/mcp-stata server. Use when users mention Stata commands, .do files, r()/e() results, dataset inspection, Stata graph exports, or data browsing with sorting/filtering.
install
source · Clone the upstream repo
git clone https://github.com/diegosouzapw/awesome-omni-skill
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data-ai/stata-mcp" ~/.claude/skills/diegosouzapw-awesome-omni-skill-stata-mcp && rm -rf "$T"
manifest: skills/data-ai/stata-mcp/SKILL.md
Stata MCP Skill
Instructions
- Ensure the `stata` MCP server is registered (see project README for config) and request it if not already active.
- When the user asks for Stata work:
  - Use `run_command` for ad-hoc syntax (`trace=True` for call stacks, `raw=True` for plain output).
  - Use `load_data` before analyses that require datasets.
  - Use `get_data`, `describe`, `codebook`, or `get_variable_list` to inspect data.
  - Use `run_do_file` for provided `.do` scripts.
  - Use `export_graph` / `export_graphs_all` for visualization requests.
  - Use `get_help` when the user wants Stata documentation.
  - Use `get_stored_results` to return `r()` / `e()` scalars/macros after commands for validation.
  - Use `read_log` to tail or retrieve output from long-running commands.
  - Use `get_ui_channel` to obtain a localhost HTTP endpoint for high-volume data browsing.
- Surface `rc` / `stderr` info back to the user, referencing `r()` / `e()` codes.
- If Stata isn't auto-discovered, remind the user to set `STATA_PATH` (examples in README).
Tool quick reference
Command Execution
- `run_command(code, echo=True, as_json=True, trace=False, raw=False, max_output_lines=None)`: Run Stata syntax.
  - `code`: The Stata command(s) to execute.
  - `echo`: Include the command itself in output (default: True).
  - `as_json`: Return JSON envelope with rc/stdout/stderr/error (default: True).
  - `trace`: Enable `set trace on` for deeper error diagnostics (default: False).
  - `raw`: Return plain stdout/error message instead of JSON (default: False).
  - `max_output_lines`: Truncate output to this many lines (default: None for no truncation).
  - Note: Always writes output to a temporary log file and emits a `notifications/logMessage` with `{"event":"log_path","path":"..."}` so the client can tail it locally.
- `run_do_file(path, echo=True, as_json=True, trace=False, raw=False, max_output_lines=None)`: Execute .do files.
  - `path`: Path to the .do file.
  - `echo`: Include commands in output (default: True).
  - `as_json`: Return JSON envelope (default: True).
  - `trace`: Enable trace mode for debugging (default: False).
  - `raw`: Return plain output instead of JSON (default: False).
  - `max_output_lines`: Truncate output to this many lines (default: None).
  - Note: Always writes output to a temporary log file and emits incremental `notifications/progress` when the client provides a progress token/callback.
- `read_log(path, offset=0, max_bytes=65536)`: Read a slice of a previously-provided log file.
  - `path`: Path to the log file (from `notifications/logMessage`).
  - `offset`: Byte offset to start reading from (default: 0).
  - `max_bytes`: Maximum bytes to read (default: 65536).
  - Returns JSON: `path`, `offset`, `next_offset`, `data`.
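The `offset` / `next_offset` pair lends itself to a simple read loop. The sketch below is a minimal illustration, not the server's own client code: `tail_log` and `fake_read_log` are hypothetical names, the real `read_log` would be invoked through your MCP client, and it assumes `data` comes back as a string and that an empty `data` means the reader has caught up.

```python
from typing import Callable, Iterator


def tail_log(read_log: Callable[..., dict], path: str, max_bytes: int = 65536) -> Iterator[str]:
    """Yield successive log chunks until a read returns no new data."""
    offset = 0
    while True:
        chunk = read_log(path, offset=offset, max_bytes=max_bytes)
        if not chunk["data"]:
            break  # caught up (assumption: empty data means nothing new yet)
        yield chunk["data"]
        offset = chunk["next_offset"]  # resume where the previous read stopped


# Hypothetical stand-in for the real tool call, using ASCII text so that
# character offsets and byte offsets coincide.
def fake_read_log(path: str, offset: int = 0, max_bytes: int = 65536) -> dict:
    log = ". regress price mpg\n(output omitted)\n"
    data = log[offset:offset + max_bytes]
    return {"path": path, "offset": offset, "next_offset": offset + len(data), "data": data}


chunks = list(tail_log(fake_read_log, "/tmp/example.log", max_bytes=16))
```

For a command that is still running, a client would typically sleep and poll again from the last `next_offset` instead of stopping at the first empty read.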
Data Loading & Inspection
- `load_data(source, clear=True, as_json=True, raw=False, max_output_lines=None)`: Load data using sysuse/webuse/use heuristics.
  - `source`: Dataset name, URL, or file path (e.g., "auto", "webuse nlsw88", "/path/to/file.dta").
  - `clear`: Append `, clear` to replace existing data (default: True).
  - `as_json`: Return JSON envelope (default: True).
  - `raw`: Return plain output (default: False).
  - `max_output_lines`: Truncate output to this many lines (default: None).
  - Note: After loading, use UI channel for advanced filtering/sorting at scale.
- `get_data(start=0, count=50)`: Retrieve a slice of the active dataset as JSON.
  - `start`: Zero-based index of first observation (default: 0).
  - `count`: Number of observations to retrieve (default: 50, max: 500).
  - Note: For advanced sorting/filtering at scale, use the UI channel endpoints (see `get_ui_channel()`).
- `describe()`: Return variable descriptions, storage types, and labels.
- `get_variable_list()`: Return JSON list of all variables with names, labels, and types.
- `codebook(variable, as_json=True, trace=False, raw=False, max_output_lines=None)`: Return codebook/summary for a specific variable.
  - `variable`: Variable name to describe.
  - `as_json`: Return JSON envelope (default: True).
  - `trace`: Enable trace mode (default: False).
  - `raw`: Return plain output (default: False).
  - `max_output_lines`: Truncate output to this many lines (default: None).
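Because `get_data` caps `count` at 500, reading a larger dataset means issuing several calls. The helper below is a small sketch of how the `(start, count)` arguments could be computed; `page_ranges` and `total_obs` are hypothetical names, and the observation count would have to come from elsewhere (for example from stored results or `describe()` output).

```python
from typing import Iterator, Tuple

MAX_COUNT = 500  # per-call cap stated for get_data


def page_ranges(total_obs: int, page_size: int = 50) -> Iterator[Tuple[int, int]]:
    """Yield (start, count) pairs that cover observations 0..total_obs-1."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    page_size = min(page_size, MAX_COUNT)  # never request more than the cap
    for start in range(0, total_obs, page_size):
        yield start, min(page_size, total_obs - start)


ranges = list(page_ranges(74, page_size=30))
# Each pair would feed one call, e.g. get_data(start=start, count=count)
```

For anything beyond a few pages, the UI channel is the intended route, since it pages, sorts, and filters without sending large payloads over MCP.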
Graph Management
- `list_graphs()`: List all graphs in Stata's memory with active graph marked.
  - Note: Graphs are automatically cached during command execution for instant exports.
- `export_graph(graph_name=None, format="pdf")`: Export a stored graph to file.
  - `graph_name`: Name of graph to export (from `list_graphs`); if None, exports active graph.
  - `format`: Output format, either "pdf" (default) or "png". Use "png" to view plots directly.
- `export_graphs_all()`: Export all graphs in memory. Returns file paths.
Help & Results
- `get_help(topic, plain_text=False)`: Return Stata help text.
  - `topic`: Command or help topic (e.g., "regress", "graph").
  - `plain_text`: Return plain text instead of Markdown (default: False).
- `get_stored_results()`: Return current `r()` and `e()` results as JSON after a command.
Session Management
- `create_session(session_id)`: Manually create a new Stata session.
- `list_sessions()`: List all active sessions and their status (running, idle, etc.).
- `stop_session(session_id)`: Terminate and clean up a specific session.
- `break_session(session_id="default")`: Interrupt the currently executing command in a session.
  - Use this tool when a command is taking too long or you want to stop a long-running loop without losing data already in memory.
  - Follow up with `read_log` to see where execution stopped.
UI Data Browser
- `get_ui_channel()`: Return a short-lived localhost HTTP endpoint + bearer token for the UI-only data browser.
  - Returns JSON with `baseUrl`, `token`, `expiresAt`, and `capabilities`.
  - Intended for VS Code extension UI to browse data at high volume (paging, filtering, sorting) without sending large payloads over MCP.
  - Loopback only (binds to `127.0.0.1`), requires bearer auth.
  - Key endpoints (all require `Authorization: Bearer <token>` header):
    - `GET /v1/dataset`: Dataset identity and state
    - `GET /v1/vars`: Variable metadata
    - `POST /v1/page`: Page data with optional sorting (`sortBy` parameter)
    - `POST /v1/arrow`: Binary Arrow IPC stream
    - `POST /v1/views`: Create filtered view
    - `POST /v1/views/:viewId/page`: Page within filtered view (supports sorting)
    - `POST /v1/views/:viewId/arrow`: Arrow stream from filtered view
    - `DELETE /v1/views/:viewId`: Delete view
    - `POST /v1/filters/validate`: Validate filter expression
  - Sorting: Use `sortBy` array in page requests (e.g., `["price"]` for ascending, `["-price"]` for descending, `["foreign", "-price"]` for multi-level)
  - Filtering: Filter expressions use Python boolean operators (`==`, `!=`, `<`, `>`, `and`, `or`); Stata-style `&` / `|` also accepted
  - Server limits: maxLimit=500, maxVars=32767, maxChars=500, maxRequestBytes=1000000, maxArrowLimit=1000000
  - Dataset tracking: `datasetId` used for cache invalidation; changing dataset invalidates view handles
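As an illustration of the bearer-authenticated paging call, the sketch below builds (but does not send) a `POST /v1/page` request with the standard library. `build_page_request` is a hypothetical helper, the port and token in the usage line are placeholders, the `Content-Type` header is an assumption, and the body fields follow the request shape shown in this document's examples.

```python
import json
import urllib.request
from typing import Optional, Sequence

MAX_LIMIT = 500  # server limit listed above (maxLimit)


def build_page_request(
    base_url: str,
    token: str,
    dataset_id: str,
    variables: Sequence[str],
    sort_by: Optional[Sequence[str]] = None,
    offset: int = 0,
    limit: int = 50,
) -> urllib.request.Request:
    """Build an authenticated POST /v1/page request without sending it."""
    if limit > MAX_LIMIT:
        raise ValueError(f"limit {limit} exceeds maxLimit={MAX_LIMIT}")
    body = {"datasetId": dataset_id, "offset": offset, "limit": limit, "vars": list(variables)}
    if sort_by:
        body["sortBy"] = list(sort_by)  # "-price" requests descending order
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/page",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumption, not stated in the source
        },
        method="POST",
    )


# Placeholder values; real ones come from get_ui_channel().
req = build_page_request(
    "http://127.0.0.1:8765", "example-token", "example-dataset",
    ["price", "mpg"], sort_by=["-price"],
)
```

Passing `req` to `urllib.request.urlopen` would perform the call. Because the endpoint is short-lived, a client should be prepared to request a fresh channel once `expiresAt` has passed.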
Cancellation
- Clients may cancel an in-flight request by sending the MCP notification `notifications/cancelled` with `params.requestId` set to the original tool call ID.
- Pass a `_meta.progressToken` when invoking the tool if you want progress updates (optional).
- Cancellation is best-effort and depends on Stata surfacing `BreakError`.
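For reference, a cancellation message is an ordinary JSON-RPC 2.0 notification, so it carries no `id` of its own. The sketch below shows one plausible way to build it; `cancel_notification` is a hypothetical helper, and most MCP client libraries send this message for you when a call is aborted.

```python
import json
from typing import Optional, Union


def cancel_notification(request_id: Union[int, str], reason: Optional[str] = None) -> dict:
    """Build a notifications/cancelled message for an in-flight request."""
    params = {"requestId": request_id}  # ID of the original tool call
    if reason is not None:
        params["reason"] = reason  # optional human-readable explanation
    return {"jsonrpc": "2.0", "method": "notifications/cancelled", "params": params}


message = json.dumps(cancel_notification(42, reason="user aborted"))
```

Since cancellation is best-effort, a client should still handle a normal response arriving after the notification has been sent.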
Error Reporting
- All tools executing Stata commands support JSON envelopes (`as_json=true`) containing:
  - `rc`: Return code from r()/c(rc)
  - `stdout`: Standard output
  - `stderr`: Standard error (captures "red text")
  - `message`: Error message
  - `line`: Line number (when Stata reports it)
  - `command`: The command that was executed
  - `log_path`: Path to log file for streaming (when applicable)
  - `snippet`: Excerpt of error output
- Stata-specific error codes (`r(XXX)`) are parsed and preserved
- Use `trace=true` to enable `set trace on` for detailed program-defined error diagnostics
- Set `MCP_STATA_LOGLEVEL` environment variable (e.g., `DEBUG`, `INFO`) to control server logging
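One way to surface an envelope to the user is to condense it into a single line. The sketch below assumes the envelope is a flat JSON object with the fields listed above and that `rc == 0` means success; the actual nesting may differ, and `summarize_envelope` is a hypothetical helper.

```python
import json


def summarize_envelope(raw: str) -> str:
    """Condense a JSON envelope into a one-line status message."""
    env = json.loads(raw)
    rc = env.get("rc", 0)
    if rc == 0:
        return "ok"  # assumption: a zero return code means the command succeeded
    parts = [f"r({rc})"]
    if env.get("message"):
        parts.append(env["message"])
    if env.get("line") is not None:
        parts.append(f"at line {env['line']}")
    if env.get("command"):
        parts.append(f"while running: {env['command']}")
    return " ".join(parts)


# Illustrative envelope; the values are made up for the example.
example = json.dumps({
    "rc": 111,
    "message": "variable foo not found",
    "line": 3,
    "command": "regress price foo",
})
summary = summarize_envelope(example)
```

Keeping the `r(XXX)` form in the summary lets the user look the code up directly in Stata's documentation.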
MCP Resources
The server exposes these resources for MCP clients:
- `stata://data/summary` → `summarize`
- `stata://data/metadata` → `describe`
- `stata://graphs/list` → graph list
- `stata://variables/list` → variable list
- `stata://results/stored` → stored r()/e() results
Graph review workflow
- Call `list_graphs()` to see available plots and identify the active graph.
- Use `export_graphs_all()` to fetch file paths for every graph; view them directly in the client.
- For a single plot, call `export_graph(graph_name="GraphName", format="png")` to get a viewable file.
- Compare the rendered PNGs to the user spec (titles, axes labels, legends, colors, filters); state whether the graph matches and what to change.
Examples
Run a regression
    # Load sample data and run regression
    load_data("auto")
    run_command("regress price mpg")
    get_stored_results()  # Retrieve coefficients and statistics
Export a histogram
    # Create and export a graph
    run_command("histogram price")
    list_graphs()  # Confirm graph exists
    export_graph(graph_name="Graph", format="png")  # Export for viewing
Debug a do-file
    run_do_file("/path/to/analysis.do", trace=True)
Inspect data structure
    load_data("nlsw88", clear=True)
    describe()
    get_variable_list()
    codebook("wage")
    get_data(start=0, count=10)
Read log output from long-running command
    # After run_command emits a log_path notification
    read_log("/tmp/stata_log_abc123.log", offset=0)
    # Continue reading with next_offset for incremental output
    read_log("/tmp/stata_log_abc123.log", offset=4096)
Advanced data browsing with sorting and filtering
    # Get UI channel for high-volume data operations
    get_ui_channel()  # Returns baseUrl, token, expiresAt

    # Example UI channel usage (requires HTTP client):
    # POST {baseUrl}/v1/page with Authorization: Bearer {token}
    # Body: {"datasetId":"...","offset":0,"limit":50,"vars":["price","mpg"],"sortBy":["-price"]}

    # Create filtered view for price < 5000
    # POST {baseUrl}/v1/views
    # Body: {"datasetId":"...","frame":"default","filterExpr":"price < 5000"}

    # Page through filtered view with sorting
    # POST {baseUrl}/v1/views/{viewId}/page
    # Body: {"offset":0,"limit":50,"vars":["price","mpg"],"sortBy":["-price"]}