Collect all customer reviews from an Amazon product URL or product-reviews URL through a logged-in Chrome session on port 9222, export a 14-column factual workbook, optionally fill translations through DeepLX, and then help the model tag the rows into a final delivery-ready spreadsheet. Use when the user sends an Amazon link and wants review scraping, competitor review analysis, review export, or a delivery-ready spreadsheet with usernames, review links, review time, helpful votes, translation, summary, sentiment, categories, and tags.
git clone https://github.com/openclaw/skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/openclaw/skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/aduo6668/amazon-review-workbook" ~/.claude/skills/clawdbot-skills-amazon-review-workbook && rm -rf "$T"
skills/aduo6668/amazon-review-workbook/SKILL.md
Amazon Review Workbook
Turn an Amazon product or review link into a two-phase delivery workbook.
This skill is designed to be portable: the scripts live inside the skill folder and do not depend on
dashcamauto or any other local repo.
Quick Path
- If this is the first run on a machine, read references/setup.md.
- Run a quick health check:
python scripts/amazon_review_workbook.py doctor --url "<amazon-url>"
- Run factual collection:
python scripts/amazon_review_workbook.py intake --url "<amazon-url>" --output-dir "<workspace>/amazon-review-output"
- If DeepLX is configured and reachable, fill 评论中文版:
python scripts/amazon_review_workbook.py translate --input-json "<workspace>/amazon-review-output/amazon_<asin>_review_rows_factual.json" --output-dir "<workspace>/amazon-review-output"
- Check coverage before deciding whether keyword expansion is worth the extra requests:
python scripts/amazon_review_workbook.py coverage-check --url "<amazon-url>" --db-path "<workspace>/amazon-review-output/amazon_review_cache.sqlite3"
- Build canonical tags and a lightweight tagging payload:
python scripts/amazon_review_workbook.py taxonomy-bootstrap --input-json "<workspace>/amazon-review-output/amazon_<asin>_review_rows_translated.json" --output-dir "<workspace>/amazon-review-output"
python scripts/amazon_review_workbook.py prepare-tagging --input-json "<workspace>/amazon-review-output/amazon_<asin>_review_rows_translated.json" --output-dir "<workspace>/amazon-review-output" --canonical-tags-json "<workspace>/amazon-review-output/canonical_tags.json"
taxonomy-bootstrap is only for building a stable canonical vocabulary for the batch. prepare-tagging consumes the full factual or translated JSON and emits a trimmed *_tagging_input.json that contains pending rows only plus cache metadata. Do not use that trimmed file as the merge source.
- Read references/tagging-guidelines.md, let the model fill only the pending rows in a separate labels JSON, then merge the labels back into the full base JSON and build the final workbook:
python scripts/amazon_review_workbook.py merge-build --base-json "<workspace>/amazon-review-output/amazon_<asin>_review_rows_translated.json" --labels-json "<workspace>/amazon-review-output/amazon_<asin>_labels.json" --output-dir "<workspace>/amazon-review-output" --taxonomy-version "v1" --strict
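The labels JSON handed to merge-build is authored by the model, so it helps to pin down its shape before the run. A minimal sketch of a plausible payload, assuming rows are keyed by a review_id field (the exact schema is defined by references/tagging-guidelines.md, and the keying here is an assumption, not a documented contract):

```python
import json

# Hypothetical labels payload. The semantic column names are the ones
# this document uses; "review_id" as the row key is an assumption.
labels = {
    "taxonomy_version": "v1",
    "rows": [
        {
            "review_id": "R1EXAMPLE",   # assumed identifier field
            "评论概括": "电池续航短，充电慢",
            "情感倾向": "负面",
            "类别": "电池",
            "分类标签": ["续航", "充电速度"],
            "重点标记": "是",
        }
    ],
}

# Serialize without escaping CJK text so the file stays human-readable.
payload = json.dumps(labels, ensure_ascii=False, indent=2)
```

Only pending rows belong in this file; rows already cached for the same taxonomy version should never be re-labeled.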
Workflow
1. Verify prerequisites
- Confirm doctor reports a valid asin.
- Confirm chrome_debug_ready is true.
- If you plan to use translate, confirm deeplx_env_ready is true.
- If deeplx_reachable is false, do not block the workflow; let the model fill 评论中文版 during tagging.
If any of these fail, read references/setup.md before continuing.
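The gating logic above can be sketched as a small helper. This assumes doctor can emit a JSON report carrying the flag names used in this document; the real output format may differ:

```python
import json

def doctor_gate(doctor_json: str, want_translate: bool) -> list[str]:
    """Return hard-blocking problems from a doctor report.

    Note deeplx_reachable=false is deliberately NOT a blocker:
    the model fills 评论中文版 during tagging instead.
    """
    report = json.loads(doctor_json)
    problems = []
    if not report.get("asin"):
        problems.append("doctor did not report a valid asin")
    if not report.get("chrome_debug_ready"):
        problems.append("chrome_debug_ready is not true")
    if want_translate and not report.get("deeplx_env_ready"):
        problems.append("deeplx_env_ready is not true")
    return problems

# Example report: DeepLX env missing, so only the translate path blocks.
sample = ('{"asin": "B0TEST1234", "chrome_debug_ready": true, '
          '"deeplx_env_ready": false, "deeplx_reachable": false}')
```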
2. Use the smallest command that fits
- For raw review collection only: use collect.
- For factual extraction plus workbook scaffolding: use intake.
- For deciding whether a keyword pass is still needed: use coverage-check.
- For rebuilding the tuned keyword state from historical data: use keyword-autotune.
- For machine translation of 评论中文版: use translate.
- For canonical tag sampling: use taxonomy-bootstrap.
- For cache-aware lightweight model input: use prepare-tagging.
- For writing the final labeled workbook: use merge-build.
Examples:
python scripts/amazon_review_workbook.py collect --url "<amazon-url>" --output-dir "<workspace>/amazon-review-output"
python scripts/amazon_review_workbook.py translate --input-json "<workspace>/amazon-review-output/amazon_<asin>_review_rows_factual.json" --output-dir "<workspace>/amazon-review-output"
python scripts/amazon_review_workbook.py coverage-check --url "<amazon-url>" --db-path "<workspace>/amazon-review-output/amazon_review_cache.sqlite3"
python scripts/amazon_review_workbook.py keyword-autotune --output-dir "<workspace>/amazon-review-output" --db-path "<workspace>/amazon-review-output/amazon_review_cache.sqlite3"
python scripts/amazon_review_workbook.py taxonomy-bootstrap --input-json "<workspace>/amazon-review-output/amazon_<asin>_review_rows_translated.json" --output-dir "<workspace>/amazon-review-output"
python scripts/amazon_review_workbook.py prepare-tagging --input-json "<workspace>/amazon-review-output/amazon_<asin>_review_rows_translated.json" --output-dir "<workspace>/amazon-review-output" --canonical-tags-json "<workspace>/amazon-review-output/canonical_tags.json"
python scripts/amazon_review_workbook.py merge-build --base-json "<workspace>/amazon-review-output/amazon_<asin>_review_rows_translated.json" --labels-json "<workspace>/amazon-review-output/amazon_<asin>_labels.json" --output-dir "<workspace>/amazon-review-output" --taxonomy-version "v1" --strict
3. Keep the workbook stable
The factual and final workbooks always use the 14-column schema in references/output-schema.md.
Do not silently add or remove columns. If a field is unavailable from the page, leave it blank rather than inventing a value.
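The blank-over-guessed rule can be made mechanical when assembling rows. A sketch, showing only three of the fourteen column names (the full ordered list lives in references/output-schema.md):

```python
# Truncated column list for illustration; the authoritative 14-column
# order is defined in references/output-schema.md.
COLUMNS = ["评论用户名", "评论中文版", "评论概括"]

def build_row(extracted: dict) -> list:
    """Map extracted fields onto the fixed column order.

    A field the page did not yield becomes an empty cell;
    nothing is ever invented to fill the gap.
    """
    return [(extracted.get(col) or "").strip() for col in COLUMNS]
```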
4. Tag rows only after grounding on the factual file
The model should not invent from the product page alone. Ground semantic tagging on the factual JSON/workbook created by
intake or translate.
Keep the two JSON shapes distinct:
- *_tagging_input.json from prepare-tagging is the cropped machine prompt payload for the model.
- --base-json for merge-build must be the full factual/translated record set, not the cropped tagging payload.
- --labels-json is the model's completed semantic output for the pending rows only.
If translate prints translation_mode=model_fallback, fill 评论中文版 in the same tagging pass instead of waiting for DeepLX.
Use references/tagging-guidelines.md when filling:
评论概括, 情感倾向, 类别, 分类标签, 重点标记
The preferred fast path is:
- taxonomy-bootstrap to build a canonical tag vocabulary for this batch
- prepare-tagging to create a minimal pending-row payload
- model labeling only for pending rows, written into a separate labels JSON
- merge-build to update cache and export the final workbook from the full base JSON
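The merge step at the end of this path can be sketched as an overlay of the labels onto the full base record set. The real merge, with cache updates and --strict validation, lives in scripts/label_workflow.py; the review_id key below is an assumption:

```python
def merge_labels(base_rows: list, label_rows: list, key: str = "review_id") -> list:
    """Overlay model labels onto the full base record set.

    Sketch only: rows without a matching label pass through unchanged,
    and label fields win over base fields on overlap.
    """
    by_key = {row[key]: row for row in label_rows}
    merged = []
    for row in base_rows:
        overlay = by_key.get(row[key], {})
        merged.append({**row, **overlay})
    return merged

base = [{"review_id": "a", "text": "hi"}, {"review_id": "b", "text": "yo"}]
labels = [{"review_id": "b", "情感倾向": "正面"}]
out = merge_labels(base, labels)
```

The base list here stands in for the full translated JSON, never the cropped *_tagging_input.json.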
Collection Defaults
- intake and collect no longer run keyword expansion implicitly in deep mode. deep now means the 18-combo pass only.
- Run coverage-check after intake to compare current rows vs Amazon's visible reviews count before deciding to spend more requests.
- Use --keywords only when you explicitly want a keyword pass.
- Use --keywords with no values to run the built-in keyword preset for the selected --keyword-profile.
- Use --keywords foo bar baz to provide an explicit keyword list.
- Default pacing now inserts a 2.5s gap between combos/keywords to reduce rate-limit risk.
- Built-in profiles:
  - generic: universal consumer-product terms
  - electronics: universal terms + common app/setup/hardware terms
  - dashcam: electronics profile + recording/night/parking/GPS/Wi-Fi/mount terms
- Default keyword reuse policy is successful: keywords that have produced results before are skipped on later runs; recent zero-result keywords are also suppressed for 72h to avoid immediate retries.
- If you really want to brute-force rerun every keyword, use --keyword-reuse-scope none.
- A tuned state file at <output-dir>/keyword_tuning_state.json is read automatically when present, and refreshed after keyword runs so the skill gradually reorders towards higher-yield terms.
- keyword-autotune can also ingest old keyword-run JSON reports via --report-glob to seed the tuned state from historical experiments.
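The reuse policy above reduces to a small filter. A sketch under stated assumptions: the real logic lives in the CLI's keyword tuning state handling, and the history record shape here (hits count plus last zero-result timestamp) is illustrative, not the script's actual format:

```python
import time

SUPPRESS_SECONDS = 72 * 3600  # recent zero-result keywords rest for 72h

def keywords_to_run(candidates, history, now=None, scope="successful"):
    """Apply the default "successful" reuse policy.

    history maps keyword -> {"hits": int, "last_zero_ts": float | None}.
    scope="none" brute-forces every keyword regardless of history.
    """
    if scope == "none":
        return list(candidates)
    now = now if now is not None else time.time()
    runnable = []
    for kw in candidates:
        record = history.get(kw)
        if record is None:
            runnable.append(kw)              # never tried: run it
            continue
        if record.get("hits", 0) > 0:
            continue                          # produced results before: skip
        last_zero = record.get("last_zero_ts")
        if last_zero and now - last_zero < SUPPRESS_SECONDS:
            continue                          # zero-result too recently: skip
        runnable.append(kw)
    return runnable

now = 1_000_000.0
hist = {
    "gps":   {"hits": 3, "last_zero_ts": None},        # successful: skip
    "mount": {"hits": 0, "last_zero_ts": now - 3600},  # zero 1h ago: skip
    "night": {"hits": 0, "last_zero_ts": now - 80 * 3600},  # past 72h: run
}
```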
Failure Boundaries
Do not claim success if any of these is true:
- The script did not reach a real review page.
- The expected XLSX/CSV for the current phase was not generated.
- Review links, review time, or helpful votes were guessed rather than extracted.
- The model tagged rows without first grounding on the factual JSON/workbook.
- The cropped *_tagging_input.json was used as --base-json for merge-build.
- The model re-labeled rows that were already cached for the same taxonomy version.
- The workflow still claims a 13-column contract after 评论用户名 was added as a real output column.
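One of these boundaries is cheap to enforce mechanically before merge-build runs. A heuristic guard: the *_tagging_input.json filename convention comes from this document, but the metadata keys checked in the fallback are assumptions about what the trimmed file might contain:

```python
def looks_like_tagging_payload(path: str, doc: dict) -> bool:
    """Refuse a --base-json that is probably the cropped payload.

    The filename suffix check follows this document's naming;
    the "pending_rows"/"cache" keys are assumed markers only.
    """
    if path.endswith("_tagging_input.json"):
        return True
    return "pending_rows" in doc or "cache" in doc
```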
Resources
- references/setup.md: first-run machine setup and environment requirements
- references/output-schema.md: fixed 14-column workbook contract
- references/tagging-guidelines.md: semantic labeling rules after factual collection
- scripts/amazon_review_workbook.py: portable CLI for doctor/collect/intake/coverage-check/keyword-autotune/translate/taxonomy-bootstrap/prepare-tagging/merge-build
- scripts/review_delivery_schema.py: workbook schema, normalization, and XLSX/CSV writer
- scripts/deeplx_translate.py: optional DeepLX translation helper
- scripts/label_workflow.py: cache, heuristics, bootstrap, and merge logic for faster labeling