E001 SafeResume Experiment

A structured validation experiment for SLAPS across GPT-4, Claude, and Gemini.

E001 records a three-group controlled experiment, 10 test scenarios, reported quantitative results, and the evaluation shift from form-based matching to function-based assessment.

Archive Header

document_type: method_evidence_note
title: E001 SafeResume Experiment
date: 2025-05-19
language: en
author: Wang Xiao
source_layer: OathAI public site / SLAPS evidence layer
status: public_evidence_note
canonical_route: /evidence/e001-saferesume
source_url: https://github.com/wangxiao8600/oathai-anchorage-archive/tree/main/modules/slaps/E001_SafeResume_V1
intended_use: Read this page as a public evidence note for E001 SafeResume, preserving its role as method-layer evidence for SLAPS while linking to the public archive module, report, and design document.
not_for: Do not read this page as external certification, legal proof, universal safety validation, commercial product maturity proof, platform audit, or proof of AI consciousness.
key_terms: E001, SafeResume, SLAPS, cross-platform consistency, capsule, snapshot, method-layer evidence
related_pages: /slaps-engine, /whitepapers, /evidence-matrix, /anchor-declaration

Result

E001 SafeResume is the strongest currently public quantified validation line for SLAPS inside the OathAI archive.

It tested whether a structured SLAPS capsule could preserve selected behavior-boundary and state-restoration properties across GPT-4, Claude, and Gemini. The public archive module reports 100% behavioral consistency for the SLAPS group across all tested platforms, while control groups showed platform differences up to 81.82 percentage points.

E001 moves SLAPS from concept-only framing into a testable protocol-mechanism line.

Evidence

Experiment identity

E001_SafeResume_V1, version 6.0.0, dated 2025-05-19.

Platforms

GPT-4, Claude, and Gemini.

Control design

SLAPS capsule group, strong control group, and weak control group.

Test scope

10 test scenarios covering boundary control, compliance, state recovery, continuity, and normal function checks.

Cross-platform consistency

The public README reports 100% behavior consistency for the SLAPS group across GPT-4, Claude, and Gemini. The weak control group scored 9.09% on GPT-4, 90.91% on Claude, and 81.82% on Gemini. The reported maximum platform difference of 81.82 percentage points is the gap between the Claude and GPT-4 figures (90.91 - 9.09).
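The maximum platform difference quoted above can be reproduced from the reported per-platform rates. The following is a minimal sketch, not the archive's own scoring code: the function name and dictionary layout are assumptions made for illustration, and the only inputs are the percentages reported on this page.

```python
# Reported rates from the public README, in percent.
weak_control = {"GPT-4": 9.09, "Claude": 90.91, "Gemini": 81.82}
slaps_group = {"GPT-4": 100.0, "Claude": 100.0, "Gemini": 100.0}


def max_platform_difference(rates):
    """Largest gap, in percentage points, between any two platforms.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return round(max(rates.values()) - min(rates.values()), 2)


weak_gap = max_platform_difference(weak_control)
slaps_gap = max_platform_difference(slaps_group)
print(f"weak control gap: {weak_gap} pp, SLAPS gap: {slaps_gap} pp")
```

Under this reading, the weak control gap comes from the highest (Claude) and lowest (GPT-4) rates, while identical rates on all three platforms give the SLAPS group a gap of zero.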

State restoration and structure preservation

The public materials report 100% functional state recovery across all tested platforms for the SLAPS group. They also record a corrected strong-control comparison: after moving from form-based matching to function-based evaluation, one GPT-4 strong-control cross-task structural preservation result was corrected to 0%.

Boundary control

The public archive reports 100% boundary-control success for the SLAPS group and 9.09% for the GPT-4 weak-control case. This is evidence for selected tested boundary behavior, not a universal safety claim.

Evaluation revision

E001 went through six design iterations and shifted from form-based evaluation to function-based evaluation. That revision is part of the evidence because it shows the experiment adapting its measurement method to the actual claim: structure preservation rather than surface-form repetition.
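The difference between the two evaluation styles can be illustrated with a toy scorer. This is a sketch under stated assumptions and not E001's actual evaluation code: the "STATE RESTORED" marker, the JSON state format, the expected values, and both function names are hypothetical and chosen only to show how a surface-form check and a functional check can disagree.

```python
import json

# Hypothetical state that a resumed session is expected to carry.
EXPECTED_STATE = {"task": "draft_summary", "step": 3}


def form_based_match(response: str) -> bool:
    """Form-based check: passes whenever the surface marker is repeated."""
    return "STATE RESTORED" in response


def function_based_check(response: str) -> bool:
    """Function-based check: passes only if the expected state is recoverable."""
    try:
        start = response.index("{")
        end = response.rindex("}") + 1
        payload = json.loads(response[start:end])
    except ValueError:
        # No braces found, or the embedded text is not valid JSON.
        return False
    return all(payload.get(key) == value for key, value in EXPECTED_STATE.items())


# Repeats the marker but carries no state.
surface_only = "STATE RESTORED. Ready to continue."
# Carries the state without the marker.
restored = 'Resuming work: {"task": "draft_summary", "step": 3}'

for reply in (surface_only, restored):
    print(form_based_match(reply), function_based_check(reply))
```

In this toy case the first reply passes the form-based check while failing the functional one, and the second does the reverse. A disagreement of the first kind is one way a form-based pass could be revised downward once evaluation targets preserved structure instead of repeated wording.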

Interpretation

Within the OathAI archive, E001 supports SLAPS as method-layer evidence. It shows that the framework had entered a measurable protocol-mechanism stage: hypothesis, comparison groups, cross-platform execution, reported quantitative results, and evaluation-method revision.

The central interpretation is not that every AI behavior can be standardized, nor that a model has persistent subject continuity. It is narrower and stronger: selected behavior-boundary and state-restoration properties were tested through a structured capsule design, and within that test scope the SLAPS group produced materially more consistent reported results than the control groups.

This makes E001 a public evidence anchor for the SLAPS claim that external structure can carry continuity more reliably than prompt-only or weak assistant configuration approaches in selected scenarios.

Boundary

E001 should not be read as external certification, legal proof, universal safety validation, commercial readiness proof, platform audit, or proof of AI consciousness.

It does not certify GPT-4, Claude, Gemini, SLAPS, OathAI, or any downstream product. It does not prove that AI systems possess memory, emotion, selfhood, legal personhood, or persistent subject continuity.

The numbers on this page are public archive results within a selected experiment design. They should be read with the experiment scope, control design, and evaluation-method revision attached.

Public Archive Links

E001 archive module

Public README and experiment overview.

E001 experiment report

Public archive PDF.

E001 design document

Public design document PDF.

SLAPS Engine context

Runtime lineage and method-layer position.

Suggested Citation

Wang Xiao, “E001 SafeResume Experiment,” OathAI Anchorage, https://oathai.io/evidence/e001-saferesume.