agent-auditor AI Agent Skill

View source: b-open-io/prompts

Global ranking: #601 of 601 skills · 10 installs · Security: Safe

Installation

npx skills add b-open-io/prompts --skill agent-auditor

Agent Auditor

Systematic audit methodology for evaluating the health, quality, and consistency of agents and skills across the plugin ecosystem. Produces actionable findings with severity ratings and recommended fixes.

Audit Checklist

Every audit evaluates skills across seven dimensions. For each skill, score pass/warn/fail per dimension.

1. Scope & Invocation

Verify the invocation control fields are set correctly.

Check against the invocation matrix:

| Scenario | user-invocable | disable-model-invocation |
|----------|----------------|--------------------------|
| Default (user + Claude can invoke) | omit (default true) | omit (default false) |
| Agent-only (hidden from / menu) | false | omit |
| User-only (Claude cannot auto-invoke) | omit | true |
| Agent-only + no auto-invoke | false | true |

Checks:

  • Does the skill require user interaction (OTP, confirmation, subjective input)? If yes, needs disable-model-invocation: true
  • Does the skill have irreversible side effects (sends money, publishes, deploys)? If yes, needs disable-model-invocation: true
  • Would a user ever type /skill-name directly? If no, needs user-invocable: false
  • Is this purely internal agent plumbing? If yes, needs user-invocable: false
  • Cross-reference: which agents list this skill in their tools: frontmatter? Does that match the intended audience?

Common failure: Skills that are agent-internal but missing user-invocable: false, cluttering the user's / menu.
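As a sketch, frontmatter for the last scenario in the matrix (an agent-internal skill with irreversible side effects) might look like the following; the skill name and description are invented for illustration:

```yaml
---
name: settle-invoice            # hypothetical skill
description: Settles an approved invoice on-chain. Internal plumbing for the payments agent.
user-invocable: false           # hidden from the user's / menu
disable-model-invocation: true  # Claude must not auto-invoke (sends money)
---
```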

2. Location & Cross-Client

  • Skill lives in the correct plugin repo (bopen-tools, bsv-skills, gemskills, 1sat-skills, product-skills, etc.)
  • Directory name matches the name field in frontmatter exactly
  • No spaces, underscores, or capitals in directory name (kebab-case only)
  • File is named exactly SKILL.md (case-sensitive)
  • No README.md inside the skill folder (all docs go in SKILL.md or references/)
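The naming and layout checks above are mechanical, so they can be scripted. A minimal POSIX sh sketch, assuming the plugin keeps skills under skills/&lt;name&gt;/SKILL.md:

```shell
# audit_layout: flag skill directories that break the layout rules above.
audit_layout() {
  for dir in skills/*/; do
    [ -d "$dir" ] || continue
    name=$(basename "$dir")
    # kebab-case only: no capitals, underscores, or spaces
    case "$name" in
      *[A-Z]* | *_* | *' '*) echo "FAIL: '$name' is not kebab-case" ;;
    esac
    # the file must be named exactly SKILL.md
    [ -f "${dir}SKILL.md" ] || echo "FAIL: '$name' has no SKILL.md"
    if [ -f "${dir}README.md" ]; then
      echo "WARN: '$name' has a README.md (docs belong in SKILL.md or references/)"
    fi
  done
}
audit_layout
```

Note this cannot verify that the directory name matches the frontmatter `name` field; that still requires reading the YAML.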

3. Description Quality

The description is the single most important field -- it determines whether Claude loads the skill.

Structure: [What it does] + [When to use it] + [Key capabilities]

Checks:

  • Uses third-person format ("This skill should be used when..." not "Use when...")
  • Includes specific trigger phrases users would actually say
  • Under 1024 characters
  • No XML angle brackets (< or >)
  • Not too vague ("Helps with projects" = fail)
  • Not too technical ("Implements the X entity model" = fail)
  • Includes negative triggers if the skill is easily confused with similar skills
  • Mentions relevant file types if applicable

Test the description: Ask Claude "When would you use the [skill name] skill?" -- Claude should quote the description back accurately. If it can't, the triggers are weak.
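To make the structure concrete, here is a hypothetical before/after (the skill and its wording are invented for illustration):

```yaml
# Too vague -- fails the checks above:
description: Helps with invoices.

# [What it does] + [When to use it] + [Key capabilities], with triggers
# and a negative trigger:
description: >-
  This skill should be used when the user asks to generate, validate, or
  send PDF invoices -- e.g. "create an invoice", "bill the client", or when
  .invoice files are present. Covers line-item math, tax fields, and PDF
  export. Not for general accounting or bookkeeping questions.
```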

4. Structure & Progressive Disclosure

Skills use a three-level system to minimize token usage:

  1. First level (frontmatter): Always in system prompt. Just enough to decide relevance.
  2. Second level (SKILL.md body): Loaded when skill is invoked. Core instructions.
  3. Third level (references/): Additional detail Claude navigates to as needed.

Checks:

  • SKILL.md body is under 2,000 words (ideally 1,500). Run wc -w to verify.
  • Detailed documentation lives in references/, not inline
  • No duplication between SKILL.md body and reference files
  • Scripts for deterministic tasks live in scripts/
  • Instructions are specific and actionable, not vague ("validate the data before proceeding" = fail)
  • Critical instructions appear at the top, not buried at the bottom
  • Uses bullet points and numbered lists over long prose paragraphs
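The word-count check can be run across a whole plugin in one pass. A sketch (note wc -w counts the frontmatter too, so treat near-limit results as soft):

```shell
# check_word_budget: flag SKILL.md files over the word budget.
check_word_budget() {
  for f in skills/*/SKILL.md; do
    [ -f "$f" ] || continue
    words=$(wc -w < "$f" | tr -d '[:space:]')
    if [ "$words" -gt 2000 ]; then
      echo "FAIL: $f is $words words (limit 2000)"
    elif [ "$words" -gt 1500 ]; then
      echo "WARN: $f is $words words (aim for 1500)"
    fi
  done
}
check_word_budget
```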

5. Testing & Benchmarks

Checks:

  • Has evals/evals.json with trigger and functional test cases
  • Trigger tests: 10 should-trigger prompts + 10 should-not-trigger prompts (realistic, not contrived)
  • Functional assertions: 3-5 per eval, specific and verifiable
  • Assertions target skill-specific knowledge, not generic model capability
  • Has baseline comparison data (pass_rate vs baseline_pass_rate)
  • Delta is positive (skill helps vs hurts)

Consult references/testing-strategies.md for the full testing methodology.
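A minimal evals.json sketch consistent with the checks above. The authoritative schema lives in references/testing-strategies.md, so treat the field names here as illustrative, and note the trigger lists are trimmed to two entries each for brevity (the checklist requires 10 + 10):

```json
{
  "trigger_tests": {
    "should_trigger": [
      "Audit the skills in the bsv-skills plugin",
      "Check whether our skills follow the seven-dimension checklist"
    ],
    "should_not_trigger": [
      "Audit this smart contract for security bugs",
      "Review my pull request"
    ]
  },
  "functional_tests": [
    {
      "prompt": "Audit the skills in skills/",
      "assertions": [
        "Report scores all seven dimensions per skill",
        "Each dimension gets a pass/warn/fail status",
        "Recommended fixes are specific and actionable"
      ]
    }
  ],
  "pass_rate": null,
  "baseline_pass_rate": null
}
```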

6. Agent Equipment

Agents that create or modify skills should have access to the right toolkit:

| Required Skill | Purpose |
|----------------|---------|
| Skill(skill-creator:skill-creator) | Interactive skill creation workflow |
| Skill(plugin-dev:skill-development) | Skill-writing best practices |
| Skill(bopen-tools:benchmark-skills) | Eval/benchmark harness |
| Skill(bopen-tools:agent-auditor) | This audit skill |

Check the agent's tools: frontmatter to verify these are listed.
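This cross-reference is also scriptable. A sketch, assuming agent definitions live in agents/*.md (as in the enumeration step below):

```shell
# check_equipment: warn when an agent's file lacks a required Skill() entry.
check_equipment() {
  for skill in 'skill-creator:skill-creator' 'plugin-dev:skill-development' \
               'bopen-tools:benchmark-skills' 'bopen-tools:agent-auditor'; do
    for agent in agents/*.md; do
      [ -f "$agent" ] || continue
      # parentheses are literal in a grep basic regular expression
      grep -q "Skill($skill)" "$agent" ||
        echo "WARN: $agent missing Skill($skill)"
    done
  done
}
check_equipment
```

Not every agent needs every skill, so treat the output as a review queue rather than a hard failure list.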

7. Generative UI Awareness

If the agent's domain involves UI generation, rendering, or cross-platform output, check for generative UI readiness.

Checks:

  • Does the agent have Skill(bopen-tools:generative-ui) in tools?
  • If the agent works with React/Next.js, does it know about json-render?
  • If the agent works with React Native, does it know about @json-render/react-native?
  • If the agent produces visual assets, does it have relevant gemskills?
  • Does the agent understand when to use generative UI vs static components?

Applicable agents: designer, agent-builder, nextjs, mobile, integration-expert

Not applicable (skip this dimension): code-auditor, documentation-writer, researcher, devops, database, payments

Audit Workflow

Step 1: Enumerate & Classify (via subagent)

Delegate enumeration and classification to a subagent to keep the main context clean:

Agent(prompt: "Enumerate and classify all skills in the target plugin.

1. Run: ls skills/*/SKILL.md and count total
2. For each skill, read the YAML frontmatter and classify:
   - Type: agent-only (user-invocable: false), user-only (disable-model-invocation: true), or default
   - Plugin it belongs in
   - Which agents reference it (grep agents/*.md for Skill(name))
3. Return a table: | Skill | Type | Referenced By | Notes |

Target directory: skills/",
subagent_type: "general-purpose")

Step 2: Run Dimension Checks (via parallel subagents)

For multi-plugin audits, dispatch one subagent per plugin in parallel. For single-plugin audits, dispatch one subagent per batch of 5-10 skills:

Agent(prompt: "Audit these skills against the seven-dimension checklist:
<list of skills from Step 1>

For each skill, evaluate: Scope & Invocation, Location & Cross-Client, Description Quality, Structure, Testing, Agent Equipment, Generative UI.

Score each dimension as pass/warn/fail. Return findings in the report format.",
subagent_type: "general-purpose")

The main context receives only the formatted audit report, not raw skill file contents.

Step 3: Score

Record per dimension:

  • Pass: Meets criteria
  • Warn: Minor issue, non-blocking
  • Fail: Must fix before publishing

Step 4: Generate Report

Format findings as:

## Audit Report: [plugin-name]

### Summary
- Total skills: N
- Pass: N | Warn: N | Fail: N

### Findings

#### [skill-name]
| Dimension | Status | Notes |
|-----------|--------|-------|
| Scope & Invocation | pass/warn/fail | details |
| Location & Cross-Client | pass/warn/fail | details |
| Description Quality | pass/warn/fail | details |
| Structure | pass/warn/fail | details |
| Testing | pass/warn/fail | details |
| Agent Equipment | pass/warn/fail | details |
| Generative UI | pass/warn/fail/skip | details |

**Recommended fixes:**
1. [specific, actionable fix]

Step 5: Fix & Re-audit

Apply fixes, then re-run the audit on changed skills only. Use the evaluator-optimizer loop from references/workflow-patterns.md for iterative improvement.

Workflow Patterns

For multi-plugin audits, use parallelization -- dispatch one subagent per plugin. See references/workflow-patterns.md for:

  • Sequential audit pipeline (single plugin)
  • Parallel dispatch (multiple plugins)
  • Evaluator-optimizer loop (iterative fixes)

Testing Strategy

See references/testing-strategies.md for:

  • Trigger testing methodology (should-trigger / should-not-trigger)
  • Functional testing with evals.json assertions
  • Performance comparison (with-skill vs without-skill baselines)
  • Quantitative and qualitative metrics
  • Description optimization loops

Reference Files

| File | When to Consult |
|------|-----------------|
| references/skill-quality-guide.md | Writing or reviewing descriptions, structure, and instructions |
| references/workflow-patterns.md | Planning multi-plugin audits or iterative fix cycles |
| references/testing-strategies.md | Creating evals, running benchmarks, measuring effectiveness |

Security audit

socket: Safe (warnings: 0, score: 90)
snyk: Low

How to use this skill

1. Install agent-auditor by running npx skills add b-open-io/prompts --skill agent-auditor in your project directory. The skill file is downloaded from GitHub and placed in your project.

2. No configuration required. Your AI agent (Claude Code, Cursor, Windsurf, etc.) detects installed skills automatically and uses them as context during code generation.

3. The skill improves your agent's understanding of agent-auditor, helping it follow established patterns, avoid common mistakes, and produce production-ready code.

What you get

Skills are plain-text instruction files -- no executable code. They encode expert knowledge about frameworks, languages, or tools, which your AI agent reads to improve its output. That means zero runtime overhead, no dependency conflicts, and full transparency: you can read and review every instruction before installing.

Compatibility

This skill works with any AI coding agent that supports the skills.sh format, including Claude Code (Anthropic), Cursor, Windsurf, Cline, Aider, and other tools that read project-level context files. Skills are framework-agnostic at the transport level -- the content determines which language or framework they apply to.

Data sourced from the skills.sh registry and GitHub. Install counts and security audits are updated regularly.
