Claude-skill-registry entry-guidelines

General quality standards for all je-dict-1 dictionary entries. Use when creating or revising any entry type.

install

source · Clone the upstream repo

git clone https://github.com/majiayu000/claude-skill-registry

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/entry-guidelines" ~/.claude/skills/majiayu000-claude-skill-registry-entry-guidelines && rm -rf "$T"

manifest: skills/data/entry-guidelines/SKILL.md

source content

Dictionary Entry Quality Guidelines

When creating or revising dictionary entries for je-dict-1, follow these quality standards:

CRITICAL: Write Each Entry Individually

DO NOT use Python scripts or automation to mass-produce entries.

Each dictionary entry must be written individually by hand, using:

Your own linguistic knowledge

The guidelines in this skill and related skills (`verb-entry`, `adjective-entry`, `particle-entry`, `other-entries`, `vocabulary-notes`)

Careful consideration of each word's unique characteristics

Why this matters:

Each word has nuances that require individual attention
Examples must be natural and contextually appropriate
Notes should address learner-specific challenges for that word
Mass-produced entries lack the quality and depth learners need

The correct workflow:

Select a word from `candidate_words.json` or a user request
Research/consider the word's usage, collocations, and common patterns
Write the entry JSON directly using the Write tool
Validate: `python3 build/validate.py`
Repeat for each entry

After finishing all entries for a session:

python3 build/validate.py           # Validate all entries
python3 build/update_indexes.py     # Update indexes and sync candidates
python3 build/build_flat.py         # Rebuild website (REQUIRED for GitHub Pages)
git add entries/ docs/ *.json PROJECT_STATUS.md
git commit -m "Add N new dictionary entries"
git push

The `build_flat.py` step is critical: without it, new entries won't appear on the live site. The build uses an atomic process (builds to a temp directory, then swaps) to prevent broken states if the build fails.

Never create scripts that generate entry content programmatically.

Before Creating a New Entry

IMPORTANT: Always check if an entry already exists before creating a new one.

Duplicate Definition

A word is a duplicate ONLY if BOTH the headword AND reading match exactly.

Homophones (same reading, different headword) are NOT duplicates and should have separate entries
- Example: 線香 (せんこう) and 先行 (せんこう) are different words
- Example: 橋/箸/端 (all はし) are different words
Homographs (same headword, different reading) are NOT duplicates and should have separate entries
- Example: 行く (いく) and 行く (ゆく) are different readings
- Example: 明日 (あした) and 明日 (あす) are different readings
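The duplicate rule above can be sketched as a comparison on the (headword, reading) pair. This is an illustration only, not the logic of `build/check_duplicate.py`; the function name `is_duplicate` and the in-memory set of existing pairs are assumptions made for the example.

```python
def is_duplicate(headword, reading, existing):
    """Return True only when BOTH headword and reading match an existing pair."""
    return (headword, reading) in existing


# Hypothetical set of pairs already present in the dictionary.
existing = {("線香", "せんこう"), ("行く", "いく")}

print(is_duplicate("先行", "せんこう", existing))  # homophone: False
print(is_duplicate("行く", "ゆく", existing))      # homograph: False
print(is_duplicate("行く", "いく", existing))      # exact match: True
```

Keying on the pair rather than on either field alone is what keeps homophones and homographs as separate entries.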

Duplicate Check Process

Run the duplicate check script:
```
python3 build/check_duplicate.py "食べる" "たべる"
```
- If it says "OK: ... is not in the dictionary or candidates" → Safe to create entry
- If it says "DUPLICATE: ..." → SKIP this word, do NOT create a duplicate
- Informational notes about homophones/homographs do NOT block entry creation

Batch checking (optional, to plan which candidates to work on):

python3 build/check_duplicate.py --batch "食べる:たべる" "飲む:のむ" "書く:かく"

If the word was in `candidate_words.json`: It will be automatically removed when you run `python3 build/update_indexes.py` after creating the entry.
Only create new entries for words that pass the duplicate check.

This prevents duplicate entries and wasted effort on entries that must later be deleted.

Content Guidelines

Explain before exemplifying - Definition first, then examples
One meaning = one example minimum - Every sense needs illustration
Show grammatical connections - Always demonstrate how words connect
Prefer natural Japanese - Avoid textbook stiffness
Highlight non-obvious distinctions - Focus on what learners cannot infer from English

Consistency Guidelines

Consistent depth across similar entries - Don't over-explain one verb while under-explaining another
Consistent structure within entry types - All verbs should have same sections
Consistent terminology - Use same labels throughout (USAGE NOTES, not sometimes Notes)

Example Sentence Guidelines

See the `example-sentences` skill for complete guidelines on:

Minimum example counts per tier (5 for basic/core, 3 for general)
Progressive length requirements
Vocabulary restrictions by tier
Quality standards and formatting

Key Requirements Summary

Minimum counts: Basic/core tiers need 5 examples per sense; general tier needs 3
Progressive length: Examples should get longer from first to last
Vocabulary restrictions: Basic tier examples must use tier-appropriate vocabulary
Always include sense_numbers: Every example must specify which definition sense(s) it illustrates

Sense Numbers Requirement

Every example sentence must have a `sense_numbers` field that links it to the definition(s) it illustrates:

"examples": [
  {
    "id": "00001_word_ex1",
    "japanese": "...",
    "english": "...",
    "sense_numbers": [1]
  }
]

Rules:

Single-sense entries: Use `[1]` for all examples
Multi-sense entries: Each example must specify which sense(s) it demonstrates
Examples illustrating multiple senses: Use `[1, 2]` format
Must reference valid senses: Numbers must match `sense_number` values in definitions

The validation script checks that all examples in multi-sense entries have valid sense_numbers.
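The rules above could be checked along the following lines. This is a minimal sketch under the assumption that an entry is a dict with `definitions` and `examples` arrays as described in this skill; the helper name `invalid_sense_refs` is hypothetical and this is not the code in `build/validate.py`.

```python
def invalid_sense_refs(entry):
    """List (example_id, problem) pairs for missing or unknown sense_numbers."""
    valid = {d["sense_number"] for d in entry["definitions"]}
    problems = []
    for example in entry["examples"]:
        refs = example.get("sense_numbers")
        if not refs:
            problems.append((example["id"], "missing sense_numbers"))
        elif not set(refs) <= valid:
            unknown = sorted(set(refs) - valid)
            problems.append((example["id"], f"unknown senses {unknown}"))
    return problems


entry = {
    "definitions": [{"sense_number": 1}, {"sense_number": 2}],
    "examples": [
        {"id": "00001_word_ex1", "sense_numbers": [1]},
        {"id": "00001_word_ex2", "sense_numbers": [1, 3]},
        {"id": "00001_word_ex3"},
    ],
}
print(invalid_sense_refs(entry))
```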

Furigana Requirements (CRITICAL)

All kanji MUST have furigana in ALL fields, including notes.

Format:

{漢字|かんじ}

This applies to:

Headwords
Example sentences
Notes field (idioms, collocations, cultural notes, etc.)
All explanatory text

Common mistakes to avoid:

✗ WRONG: 暖簾に腕押し
✓ RIGHT: {暖簾|のれん}に{腕押|うでお}し

✗ WRONG: 安堵の息をつく
✓ RIGHT: {安堵|あんど}の{息|いき}をつく

✗ WRONG: Sometimes written as 家鴨
✓ RIGHT: Sometimes written as {家鴨|あひる}

Use compound readings for jukugo: {友達|ともだち}, not {友|とも}{達|だち}

Verify before finalizing:

python3 build/verify_furigana.py <entry_id>
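As a rough illustration of what "all kanji have furigana" means, the sketch below strips every `{漢字|かんじ}` annotation and reports any kanji left over. It is not the implementation of `build/verify_furigana.py`; the function name `bare_kanji` and the Unicode ranges used to recognize kanji are assumptions made for this example.

```python
import re

# A {base|reading} annotation, e.g. {暖簾|のれん}.
FURIGANA = re.compile(r"\{[^{}|]+\|[^{}|]+\}")
# CJK Unified Ideographs, Extension A, and the iteration mark 々 (assumed coverage).
KANJI = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff\u3005]")


def bare_kanji(text):
    """Return the kanji in text that are not wrapped in a furigana annotation."""
    return KANJI.findall(FURIGANA.sub("", text))


print(bare_kanji("暖簾に腕押し"))                      # ['暖', '簾', '腕', '押']
print(bare_kanji("{暖簾|のれん}に{腕押|うでお}し"))    # []
print(bare_kanji("Sometimes written as {家鴨|あひる}"))  # []
```

An empty result means every kanji in the string sits inside an annotation; the project script remains the authoritative check.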

Entry Structure

Every entry must include:

`id`: Format `{5-digit-number}_{romaji}` (e.g., `00396_taberu`)
`headword`: With furigana notation
`reading`: Hiragana only (see Reading Format below)
`part_of_speech`: Consistent terminology
`gloss`: Brief English equivalent
`definitions`: Array with sense_number, gloss, explanation
`examples`: Minimum count depends on tier (5 per sense for basic/core, 3 for general; see Example Sentence Guidelines), with id, Japanese, English, sense_numbers, and optional notes
`notes`: Usage notes, grammar patterns, common mistakes (see `vocabulary-notes` skill for formatting requirements)
`metadata`: Including vocabulary_tier (always "general" for new entries), created, modified timestamps
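The field list above can be spot-checked mechanically after an entry has been written by hand. The following is a minimal sketch, not the project's `validate.py`; the helper name `structure_problems` and the assumption that the romaji part of the ID consists of lowercase ASCII letters are illustrative only.

```python
import re

REQUIRED_FIELDS = (
    "id", "headword", "reading", "part_of_speech", "gloss",
    "definitions", "examples", "notes", "metadata",
)
# Five digits, an underscore, then romaji (assumed here to be lowercase a-z).
ID_PATTERN = re.compile(r"\d{5}_[a-z]+")


def structure_problems(entry):
    """Return a list of human-readable problems with an entry's top-level shape."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in entry]
    entry_id = entry.get("id", "")
    if not ID_PATTERN.fullmatch(entry_id):
        problems.append(f"bad id format: {entry_id!r}")
    return problems


print(structure_problems({"id": "396_taberu", "headword": "{食|た}べる"}))
```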

Reading Format (CRITICAL)

All readings MUST be in hiragana, never katakana.

This applies to ALL entries, including:

Loanwords (katakana headwords like スキー, ストレージ)
Abbreviations (DM, PC, etc.)
Any word regardless of how the headword is written

Examples:

✓ CORRECT:
  headword: "スキー"
  reading: "すきー"

✓ CORRECT:
  headword: "DM"
  reading: "でぃーえむ"

✗ WRONG:
  headword: "スキー"
  reading: "スキー"  ← Katakana readings cause duplicates!

Why this matters:

Katakana readings cause duplicate entries (same word with two different reading formats)
The dictionary uses readings for indexing and deduplication
Consistent hiragana readings ensure proper sorting and lookup

Note: The long vowel mark ー is acceptable in hiragana readings (e.g., すきー, すとれーじ) since there is no hiragana equivalent.

The validation script (`validate.py`) will report errors for entries with katakana readings.
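A check of this kind can be sketched as follows. This is an illustration rather than the code in `validate.py`; the helper names `has_katakana` and `to_hiragana` are hypothetical. It relies on the fact that katakana letters sit 0x60 code points above their hiragana counterparts in Unicode, and it leaves the long vowel mark ー untouched.

```python
import re

# Katakana letters ァ..ヺ; the long vowel mark ー (U+30FC) is deliberately excluded.
KATAKANA = re.compile(r"[\u30a1-\u30fa]")


def has_katakana(reading):
    """True if the reading contains any katakana letter."""
    return bool(KATAKANA.search(reading))


def to_hiragana(text):
    """Shift katakana ァ..ヶ down to hiragana; leave everything else unchanged."""
    return "".join(
        chr(ord(ch) - 0x60) if "\u30a1" <= ch <= "\u30f6" else ch
        for ch in text
    )


print(has_katakana("スキー"))      # True
print(has_katakana("すきー"))      # False
print(to_hiragana("ストレージ"))   # すとれーじ
```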

File Placement (CRITICAL)

Entries MUST be placed in the correct numeric range directory.

The path follows this pattern:

entries/{range}/{entry_id}.json

The range directory is determined by the numeric portion of the entry ID, rounded down to the nearest 500:

IDs 00000-00499 → `entries/00000/`
IDs 00500-00999 → `entries/00500/`
IDs 01000-01499 → `entries/01000/`
etc.

Examples

Entry `00396_taberu` → `entries/00000/00396_taberu.json`
Entry `00538_aruku` → `entries/00500/00538_aruku.json`
Entry `01186_mukau` → `entries/01000/01186_mukau.json`
Entry `06237_fumikiru` → `entries/06000/06237_fumikiru.json`
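The rounding rule can be written out as integer arithmetic. This is a sketch of the rule as stated above, not the code of `build/get_entry_path.py` (which also takes the reading as an argument); the function name `entry_path` is hypothetical.

```python
def entry_path(entry_id, bucket=500):
    """Build the file path by rounding the numeric ID down to a multiple of bucket."""
    number = int(entry_id.split("_", 1)[0])
    range_dir = number // bucket * bucket
    return f"entries/{range_dir:05d}/{entry_id}.json"


print(entry_path("00396_taberu"))    # entries/00000/00396_taberu.json
print(entry_path("00538_aruku"))     # entries/00500/00538_aruku.json
print(entry_path("06237_fumikiru"))  # entries/06000/06237_fumikiru.json
```

The sketch is only for understanding the layout; the project script should still be run to obtain the path actually used.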

How to Get the Correct Path

ALWAYS run this command to determine the correct path before writing:

python3 build/get_entry_path.py <reading> <entry_id>

Example:

python3 build/get_entry_path.py ふみきる 06237_fumikiru
# Output: entries/06000/06237_fumikiru.json

python3 build/get_entry_path.py こうりつてき 06240_kouritsuteki
# Output: entries/06000/06240_kouritsuteki.json

The `validate.py` script checks for directory mismatches and will report errors.

Metadata Timestamps

CRITICAL: Timestamps MUST be actual current UTC time. The website converts UTC to JST (+9 hours) for display. Incorrect timestamps will show as wrong dates/times (often appearing hours or days in the future).

How to Get the Correct Timestamp

ALWAYS run this command to get the current UTC timestamp before writing each entry:

python3 build/get_timestamp.py

This outputs the current UTC time, e.g.:

2026-01-12T10:45:30Z

Copy this exact output into both `created` and `modified` fields (for new entries) or just `modified` (for revisions).

Why This Matters

The `Z` suffix means UTC (not local time, not JST)
The build script adds 9 hours to convert to JST for display
If you write `16:00:00Z` when actual UTC is `10:00`, it displays as 01:00 JST next day (wrong!)
If you write `10:00:00Z` when actual UTC is `10:00`, it displays as 19:00 JST same day (correct!)
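The two cases above can be reproduced with the standard library. This sketch only illustrates the +9 hour conversion; it is not the build script's code, and the function name `displayed_jst` and the sample date are assumptions.

```python
from datetime import datetime, timedelta, timezone

JST = timezone(timedelta(hours=9))


def displayed_jst(stamp):
    """Convert a 'YYYY-MM-DDTHH:MM:SSZ' UTC timestamp to the JST time shown on the site."""
    utc = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return utc.astimezone(JST).strftime("%Y-%m-%d %H:%M JST")


print(displayed_jst("2026-01-12T10:00:00Z"))  # 2026-01-12 19:00 JST
print(displayed_jst("2026-01-12T16:00:00Z"))  # 2026-01-13 01:00 JST
```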

Common Mistakes to Avoid

DO NOT guess or estimate the timestamp
DO NOT use your perception of current time - always run the script
DO NOT use round hours like `12:00:00Z` or `15:00:00Z` (these are almost certainly wrong)
DO NOT copy timestamps from other entries
DO NOT write JST time with a Z suffix (this causes 9-hour errors)

Validation

Run `python3 build/validate.py` to check for:

Future timestamps (timestamp more than 24 hours ahead of current UTC time)
Suspiciously round timestamps (minutes and seconds exactly `:00:00`, likely not from the script)

Note: The validator allows a 24-hour grace period for timestamps to accommodate CI/CD clock drift.
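The two checks described above could look roughly like this. It is a sketch under the stated rules (24-hour grace period, minutes and seconds both zero), not the actual `validate.py`; the function name `timestamp_warnings` and the `now` parameter are hypothetical, the latter added so the example is reproducible.

```python
from datetime import datetime, timedelta, timezone


def timestamp_warnings(stamp, now=None):
    """Return warnings for a 'YYYY-MM-DDTHH:MM:SSZ' timestamp."""
    now = now or datetime.now(timezone.utc)
    ts = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    warnings = []
    if ts - now > timedelta(hours=24):
        warnings.append("future")  # beyond the 24-hour grace period
    if ts.minute == 0 and ts.second == 0:
        warnings.append("round")   # exactly on the hour, probably hand-typed
    return warnings


now = datetime(2026, 1, 12, 10, 45, 30, tzinfo=timezone.utc)
print(timestamp_warnings("2026-01-12T10:45:30Z", now))  # []
print(timestamp_warnings("2026-01-12T12:00:00Z", now))  # ['round']
print(timestamp_warnings("2026-01-14T12:00:00Z", now))  # ['future', 'round']
```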

Vocabulary Tier Policy

All new entries must be assigned to the "general" tier.

As of January 2026, the vocabulary tier realignment is complete:

Basic tier (795 entries): Fixed - contains foundational vocabulary
Core tier (1,998 entries): Fixed - contains essential adult communication vocabulary
General tier (4,566+ entries): All other vocabulary, including all new entries

Do NOT assign new entries to basic or core tiers unless explicitly instructed by the user. The basic and core tiers have been curated to meet specific word count targets and maintain semantic group integrity.

For `metadata.vocabulary_tier`, always use `"general"`:

"metadata": {
  "vocabulary_tier": "general",
  "created": "...",
  "modified": "..."
}

Metadata Tags (REQUIRED)

All entries must have properly structured tags in `metadata.tags`. This enables search, filtering, and export functionality.

Required Tag Categories

"metadata": {
  "vocabulary_tier": "general",
  "tags": {
    "pos": ["noun"],                    // REQUIRED: Part of speech (array)
    "formality": "neutral",             // REQUIRED: formal/neutral/informal/vulgar
    "politeness": "plain",              // REQUIRED: honorific/humble/polite/plain
    "semantic": ["food"]                // REQUIRED: Semantic category (array)
  },
  "created": "...",
  "modified": "..."
}

Part of Speech (`pos`)

Valid values: `noun`, `verb-godan`, `verb-ichidan`, `verb-suru`, `verb-kuru`, `verb-irregular`, `adjective-i`, `adjective-na`, `adjective-no`, `adjective-taru`, `adverb`, `particle`, `conjunction`, `interjection`, `pronoun`, `counter`, `prefix`, `suffix`, `expression`, `pre-noun-adjectival`, `number`, `onomatopoeia`, `auxiliary`

Use arrays for multi-function words: `["noun", "verb-suru"]`
The array should list the most common/primary POS first

Formality

`formal`: Used in formal/written contexts (敬語, 硬い表現)
`neutral`: Standard usage appropriate for most contexts (default)
`informal`: Casual/colloquial usage (くだけた表現)
`vulgar`: Strong/offensive language (use sparingly)

Politeness (Keigo Classification)

`honorific`: 尊敬語 - Elevates the subject (いらっしゃる, おっしゃる)
`humble`: 謙譲語 - Lowers the speaker (申す, 参る)
`polite`: 丁寧語 - General polite forms (です/ます base forms)
`plain`: 普通体 - Plain/dictionary forms (default for most entries)

Semantic Categories

Choose the most appropriate category(ies) for the word's meaning:

Specific categories (use when applicable):

Time: `time-day-of-week`, `time-month`, `time-season`, `time-period`, `time-general`

Nature: `animal-mammal`, `animal-bird`, `animal-fish`, `animal-insect`, `animal-general`, `plant-tree`, `plant-flower`, `plant-general`, `weather`, `geography`

H: `body-part`, `body-internal`, `family`, `person`, `occupation`

Objects: `food`, `clothing`, `building`, `transportation`, `tool`, `furniture`, `electronics`

Abstract: `emotion`, `color`, `number`, `direction`, `size`, `quantity`

Actions: `movement`, `communication`, `cognition`, `existence`, `consumption`

Social: `greeting`, `education`, `work`, `leisure`
Fallback categories (when no specific category fits):

`general`: For nouns without a specific semantic category
`action`: For verbs not fitting other action categories
`descriptive`: For adjectives and adverbs
`grammatical`: For particles and conjunctions
`expression`: For fixed expressions and interjections
`onomatopoeia`: For mimetic words

Optional Tag Categories

"tags": {
  // ... required tags above ...
  "transitivity": "transitive",     // For verbs: transitive/intransitive/both
  "style": ["spoken"],              // written/spoken/literary/archaic/slang
  "domain": ["business"]            // business/academic/technical/legal/medical/etc.
}

`transitivity`: Required for verb entries (omit for other parts of speech); indicates whether the verb takes a direct object
`style`: Use when the word is strongly associated with a medium
`domain`: Use when the word is specialized/technical

Tag Selection Tips

Be specific when possible: Use `food` not `general` for 寿司
Multiple tags allowed: 朝ご飯 can be `["food", "time-period"]`
Match the primary meaning: Tag based on the word's core meaning
Check similar entries: Ensure consistency with related words
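The tag requirements in this section can be summarized as a small check. This is a sketch based only on the rules stated here, not the project's validator; the function name `tag_problems` is hypothetical, and valid `pos` and `semantic` values are not enumerated in it.

```python
FORMALITY = {"formal", "neutral", "informal", "vulgar"}
POLITENESS = {"honorific", "humble", "polite", "plain"}


def tag_problems(tags):
    """Return problems with a metadata.tags dict according to the rules above."""
    problems = []
    for key in ("pos", "semantic"):
        value = tags.get(key)
        if not isinstance(value, list) or not value:
            problems.append(f"{key} must be a non-empty array")
    if tags.get("formality") not in FORMALITY:
        problems.append("formality missing or invalid")
    if tags.get("politeness") not in POLITENESS:
        problems.append("politeness missing or invalid")
    pos = tags.get("pos") if isinstance(tags.get("pos"), list) else []
    if any(p.startswith("verb-") for p in pos) and "transitivity" not in tags:
        problems.append("transitivity is required for verbs")
    return problems


noun_tags = {"pos": ["noun"], "formality": "neutral", "politeness": "plain", "semantic": ["food"]}
verb_tags = {"pos": ["verb-ichidan"], "formality": "neutral", "politeness": "plain", "semantic": ["consumption"]}
print(tag_problems(noun_tags))  # []
print(tag_problems(verb_tags))  # ['transitivity is required for verbs']
```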

Quality Checklist

Before finalizing any entry, verify:

File placed in correct directory (use `python3 build/get_entry_path.py <reading> <entry_id>`)
All kanji have furigana (headword, examples, AND notes)
Verify: `python3 build/verify_furigana.py <entry_id>` shows "✓ OK"
Tags are complete: pos, formality, politeness, semantic all present
Examples progress from simple to complex
At least one collocation or fixed phrase is shown
Grammar patterns are explicitly demonstrated
Notes cover common learner mistakes
Notes are properly formatted (see `vocabulary-notes` skill)
Depth matches similar entries in the dictionary
All examples have valid sense_numbers
Run `python3 build/validate.py` to catch any directory or other errors
