Workshop 1/15 - Framing

AI for Human Rights Experts: Practical Workflows in 90 Minutes

A practical, evidence-based workshop on where AI helps, where it fails, and how to run a safe hands-on build from a UN report.

Audience: Human rights & rule of law experts
Date: February 2026
Facilitator: Łukasz Szoszkiewicz

Goal 1: Understand the capability curve
Goal 2: Run a live PDF-to-dashboard task
Goal 3: Apply governance controls

Scope assets: existing deck, `benchmark_results_1_1.yaml`, sample UN PDF, and demo media in this folder.

Workshop 2/15 - Agenda

90-Minute Flow (time-boxed)

Each block has one concrete output so participants leave with a reusable method, not just ideas.

0:00-0:05  Framing: outcomes, constraints, baseline poll.
0:05-0:12  Pace and capability: adoption acceleration + jagged frontier.
0:12-0:22  Cybernetic teammate narrative: AI helps us code while we focus on problems.
0:22-0:32  From tools to agents: benchmark trend + use-case matrix.
0:32-1:02  Hands-on lab: run the prompt on the provided UN PDF.
1:02-1:12  Debrief: assess output quality and failure modes.
1:12-1:20  Benefits vs limits for human-rights teams.
1:20-1:30  Scale-up path + commitments.

Facilitation note: use the timer overlay (`T`) to keep transitions strict.

Workshop 3/15 - Pace of Adoption

Pace of AI Adoption: 100 Million Users in 2 Months

We are used to technologies spreading over long periods: alarm clocks and human knocker-ups coexisted from the 1790s into the 1920s, more than a century of overlap. AI diffusion now happens in months.

Alarm clock visual
1790s onward: alarm clocks spread gradually.
Knocker-up visual
1920s: knocker-ups were still waking workers in some places.
Human-rights implication: institutions must adapt faster because public-facing AI tools now diffuse faster than legal and policy systems.

Acceleration from decades to months

  • Personal computers: 15 years
  • Internet (WWW): 7-8 years
  • Facebook: 4.5 years
  • ChatGPT: 2 months
Milestone compared across technologies: time to reach 100 million users/households.

Sources: adoption timeline values from your original slide set; ITU Facts and Figures 2025 (17 Nov 2025); historical references listed in your deck notes (Atlas Obscura and Jeffrey Rubel).

Workshop 4/15 - Capability Shape

Jagged Technological Frontier

AI can deliver advanced output in one task and fail badly in adjacent tasks. Reliability is uneven, not linear.

What this means for legal and human rights work

  • Do not assume stable quality across tasks, languages or document types.
  • Hallucinations are inherent to the technology, not occasional bugs: LLMs generate statistically probable text, not verified facts.
  • Treat every output as draft evidence requiring human verification.
  • Use narrow success criteria per task (extract, classify, explain, cite).
Good fit now

Structured extraction, first-pass coding, interface prototyping, multilingual assistive drafting.

High-risk now

Final legal interpretation, high-stakes factual claims without traceability, unsupervised web actions.

Jagged frontier chart
Concept retained from your original deck and translated to operational workflow decisions.
Workshop 5/15 - Agentic Trend

Agentic AI: Task Horizon Keeps Expanding

This chart is loaded from your benchmark file when possible and falls back to embedded data if local-file loading is blocked.


Interpretation prompt: Which legal/research tasks become newly automatable as task horizon increases, and which remain governance-bound?

Source: `/Users/lszoszk/Desktop/HURIDOCS/AI presentation/benchmark_results_1_1.yaml` + METR methodology context.

Workshop 6/15 - Cybernetic Teammate

AI as a Cybernetic Teammate: Why This Unlocks Prototyping

Evidence from 776 professionals suggests that one human + AI can perform like a cross-functional team. For us, this means we can build tools even without traditional engineering teams.

Field experiment chart
Field experiment visual from your original deck.

AI as a Coding Teammate

  • AI can now perform many coding sub-tasks that previously blocked non-technical experts.
  • Coding becomes an execution task inside a broader human-rights problem-solving workflow.
  • Your human role shifts upward: define the problem, constraints, quality criteria, and safeguards.
  • Use AI to write and document the code - scripts are deterministic, LLMs are not.
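The last bullet can be made concrete: instead of asking a model to re-extract paragraphs on every run, ask it once to write a small script, then run that script deterministically. A minimal sketch, assuming UN-style paragraph numbering (`1.`, `2.`, ...); the function name and sample text are illustrative, not part of the workshop materials:

```python
import re

def extract_numbered_paragraphs(text: str) -> dict[int, str]:
    """Split a report body into its numbered paragraphs.

    Assumes each paragraph starts on a new line with "<number>."
    followed by whitespace, as in many UN documents.
    """
    pattern = re.compile(r"^(\d+)\.\s+", re.MULTILINE)
    matches = list(pattern.finditer(text))
    paragraphs = {}
    for i, m in enumerate(matches):
        # A paragraph runs until the next numbered marker (or end of text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        paragraphs[int(m.group(1))] = text[m.end():end].strip()
    return paragraphs

sample = "1. First finding.\n2. Second finding.\n3. Third finding."
print(extract_numbered_paragraphs(sample))
```

Unlike a fresh LLM call, this script returns the same paragraphs every time, which makes spot-checking and version control straightforward.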

Source: NBER Working Paper 33641 (April 2025), experiment with 776 professionals.

Workshop 7/15 - Coding as Task

Coding Is a Task. Problem Solving Is the Priority.

Transition from the previous slide: if AI acts as a cybernetic teammate, coding can be delegated while humans focus on identifying and solving higher-order human-rights problems.

Jensen Huang clip supports the “problem vs task” framing.

What humans must own

  • Problem definition: what concrete rights-related gap are we solving?
  • Data strategy: what sample data is enough to test value quickly?
  • Quality and risk checks: what counts as acceptable output? (easier to verify = better for AI)
  • Decision accountability: who approves deployment and use?

Source: local media file (`ssstwitter.com_1770631646628.mp4`) and workshop narrative integration.

Workshop 8/15 - Human Rights Matrix

Where AI Helps Human Rights Practice Today

Think in bounded workflow modules, each with an accountability owner.

Monitoring

Rapid extraction from UN reports, court decisions, and civil society submissions with triage tags.

Legal Analysis

First-pass categorization and issue spotting against rights frameworks, then expert review.

Reporting

Draft narrative structures, evidence tables, and stakeholder-facing summaries with source links.

Evidence Triage

Searchable paragraph-level repositories with thematic labels and explainable selection rationale.

Design rule

Every AI step should produce an inspectable artifact: extracted text, labels, or ranked candidates.

Control rule

For each artifact, assign a human role: reviewer, approver, or escalation owner.
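The design and control rules above can be captured in a tiny artifact log: one record per AI step, each with an inspectable output and a named human role. A hedged sketch in Python; the field names, role labels, and file paths are illustrative assumptions, not a prescribed schema:

```python
from dataclasses import dataclass, field
from datetime import date

# Roles the control rule assigns to each artifact (illustrative labels).
ROLES = ("reviewer", "approver", "escalation_owner")

@dataclass
class Artifact:
    """One inspectable output of an AI step: extracted text, labels, or ranked candidates."""
    step: str          # e.g. "extraction", "classification"
    content_ref: str   # path or ID of the inspectable output
    owner: str         # named human accountable for this artifact
    role: str          # one of ROLES
    reviewed: bool = False
    created: date = field(default_factory=date.today)

    def __post_init__(self):
        if self.role not in ROLES:
            raise ValueError(f"unknown role: {self.role}")

log = [
    Artifact("extraction", "out/paragraphs.json", "A. Analyst", "reviewer"),
    Artifact("classification", "out/themes.csv", "B. Lead", "approver"),
]
# Anything still unreviewed is a pending control gate, not usable output.
unreviewed = [a.content_ref for a in log if not a.reviewed]
```

Even a spreadsheet with these columns satisfies the rule; the point is that no AI output exists without an owner and a review state.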

Source basis: workshop synthesis from your prior deck, UN document use case, and current governance standards.

Workshop 9/15 - Hands-On Setup

Hands-On Setup (Gemini Canvas / Claude Artifacts / ChatGPT)

Recommended path: Gemini with Canvas and the best available model. Alternatives: Claude with Artifacts, or ChatGPT with the best available model.

Platform instructions

  • Option A (recommended): Gemini - open Canvas, select the best available model, upload the sample PDF, paste the prompt, generate the app.
  • Option B: Claude - turn on Artifacts, upload the sample PDF, paste the prompt, generate the app artifact.
  • Option C: ChatGPT - select the best available model, upload the sample PDF, paste the prompt, generate a single-file HTML app.

Execution constraint

Stay within a single run first. Do not iterate the prompt before capturing initial output quality.

Facilitator checklist

Source: workshop implementation requirement and provided local sample files.

Workshop 10/15 - Prompt

The Hands-On Prompt (Use This Exact Version)

Copy-paste the full prompt below, attach the sample UN PDF, and run it in Gemini Canvas (recommended) or Claude Artifacts / ChatGPT.

Recommendation: run once without edits, then do one constrained iteration only after rubric scoring.

Source: your original workshop prompt content from the existing slide deck.

Workshop 11/15 - Sample PDF

Sample File and Run Instructions

Use the provided UN report PDF as a controlled input for comparable outputs.

Sample UN report page
Sample document: A/HRC/59/50 report pages (local PDF asset).

File to attach

/Users/lszoszk/Desktop/HURIDOCS/AI presentation/a-hrc-59-50-aev-copy.pdf

Run sequence

  1. Choose one path: Gemini + Canvas (recommended), Claude + Artifacts, or ChatGPT.
  2. Attach PDF and paste prompt from previous slide.
  3. Run once with no edits and generate the app.
  4. Export/save output for group debrief.

Source: local UN sample report (`a-hrc-59-50-aev-copy.pdf`).

Workshop 12/15 - Debrief Rubric

Output Validation Rubric (10 points + risk check)

Score before improving. This separates prompt quality from model luck.

Score each criterion 0-2 (five criteria, 10 points total):

  • Paragraph extraction: exactly 15 paragraphs, original numbering retained.
  • Thematic framework: 10 coherent themes, with rationale for category design.
  • Search behavior: keyword search + thematic filter + visible highlight behavior.
  • UI quality: readable, responsive, and appropriate for policy/legal audiences.
  • Traceability: can reviewers inspect text, mapping logic, and potential errors?
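The rubric arithmetic is simple enough to tally during the debrief: five criteria at 0-2 points each give the 10-point total. A minimal sketch; the dictionary keys mirror the rubric rows and the example scores are hypothetical:

```python
# Each criterion scores 0-2; five criteria give a 10-point total.
RUBRIC = [
    "Paragraph extraction",
    "Thematic framework",
    "Search behavior",
    "UI quality",
    "Traceability",
]

def score_run(scores: dict[str, int]) -> int:
    """Sum rubric scores, rejecting unknown criteria or out-of-range values."""
    for criterion, value in scores.items():
        if criterion not in RUBRIC:
            raise ValueError(f"unknown criterion: {criterion}")
        if not 0 <= value <= 2:
            raise ValueError(f"score out of range for {criterion}: {value}")
    return sum(scores.values())

example = {"Paragraph extraction": 2, "Thematic framework": 2,
           "Search behavior": 1, "UI quality": 2, "Traceability": 1}
print(score_run(example), "/", 2 * len(RUBRIC))  # prints: 8 / 10
```

Recording each run as such a dictionary makes group comparisons concrete: the same prompt and PDF should produce broadly similar scores across participants.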

Risk check A

Does the output invent facts or mislabel legal meaning?

Risk check B

Could a non-expert mistake this for validated legal analysis?

Source: workshop debrief protocol (human-in-the-loop verification standard).

Workshop 13/15 - Benefits vs Limits

Benefits vs Limitations in Human Rights Practice

Adopt aggressively for speed; govern aggressively for rights, privacy, and due process.

Benefits

  • Tailor-made workflows for small teams and NGOs.
  • Fast prototyping from plain language instructions.
  • Lower barriers for legal-tech experimentation.
  • Earlier integration of human-rights-by-design principles.

Limitations and risks

  • Hallucinations and silent reasoning errors.
  • Privacy and confidentiality exposure in cloud workflows.
  • Security risks in automated browsing/action tools.
  • Maintenance burden when scaling prototypes.

Control set (minimum)

Data classification, explicit review gates, source-linked outputs, retention policy, and incident logging.

Literacy target

Every team member can explain model limits, prompt scope, and when to escalate to domain experts.

Source basis: original workshop section + updated AI governance practice.

Workshop 14/15 - Beyond 120 Seconds

If You Have More Than 120 Seconds

Scale the prototype into an offline application. Then, go online.

Workshop 15/15 - Close

Close: Commitments, Controls, Next Step

Before we leave: one commitment per participant on a real workflow to test this week.

One-week commitment

  • Identify one big human-rights problem worth solving.
  • Start a structured conversation with AI about solution options.
  • Prepare a sample dataset in conversation with AI (e.g., 100 judgments or 100 texts).
  • Draft and generate code (app) tailored to that dataset.
  • Document how much time it took and decide whether developing your own app is the way to go.


Keyboard controls

  • Previous / Next slide: ← / →
  • Next slide: Space
  • Fullscreen: F
  • Notes panel: N
  • Timer start/pause: T
  • Open print deck: P

End state: participants leave with prompt, sample file, validation rubric, and governance checklist.
