Skip to content

Review Access State Overhaul

Follow-up work for PR #2467 that fixes the three issues captured in ADR-008:

  1. Reviewer navigating to a capacity-full study can see/edit the annotation form when they should not.
  2. Saved-session capacity is invisible to the surplus/lock logic.
  3. Time-based state drifts because constants are duplicated client-side.
  4. Scheduling infrastructure (Quartz) is untested outside mocks.

This plan implements the ADR. Delete when merged.


Scope

  • Extend the snapshot DTO with TotalAllocatedSessionCount, ServerTimestamp, and absolute-UTC lifecycle timestamps.
  • Wire EnsureActiveReviewSessionForDirectNavigationAsync into the REST loadStageStudy path; extend the REST response with the access result.
  • Change the SignalR JoinStudyReview hub method to return Task<JoinStudyReviewResult>; adapt the client to await it.
  • Client store: reconcile incoming snapshots by serverTimestamp (last-write-wins); delete SUSPENDED_GRACE_PERIOD_MS; derive countdowns from absolute timestamps with a 10-second conservative display buffer.
  • Add an access-denied dialog with "Go to next study" (primary) and "View study details" (secondary). Trigger on both first-join denial and reconnect-denial (rejoinFailed$).
  • Wire the saveRejected copy into a modal on 409 Conflict from submitSession; report the same 409 to Sentry as an exception.
  • Add integration tests against real in-process Quartz (RAMJobStore) for the scheduling behaviour the feature depends on.

Out of scope

  • Moving slot reservation entirely to REST (considered in ADR-008 "Alternatives"; deferred).
  • AdminOnlySessionInfo for saved-reviewer IDs (YAGNI; revisit when admin console needs it).
  • Stage-level stats KPIs around reviewer engagement (separate ticket if needed).
  • Refactoring the next-study flow to do server-side atomic assign before returning (touches a different feature flow).

Status (as of 2026-04-17)

  • Phase 1 — snapshot DTO + server timestamp
  • Phase 2 — absolute timestamps + drift elimination
  • Phase 3 — REST response carries access result (server)
  • Phase 4 — Hub method returns structured result
  • Phase 5 — Access-denied dialog (initial + rejoin paths)
  • Phase 6 — 409 submit handling (dialog + Sentry)
  • Phase 7 — Quartz scheduling integration tests (deferred to follow-up)

Phases 1-6 shipped as part of PR #2467. Phase 7 deferred — see "Future work" below.

Task breakdown

Phase 1 — Snapshot DTO + server timestamp

Server changes:

  • Add TotalAllocatedSessionCount and ServerTimestamp to StudyReviewPresenceSnapshot; populate in FromStudy(...).
  • Add absolute-UTC lifecycle fields to SlotReservationDto (SuspendedAt, ReleaseAt, IdleAt, IdleRemovalAt). Populate from domain model.
  • Update StudyReviewPresenceSnapshotTests for new fields.

Client changes:

  • Regenerate api-client.generated.ts to pick up new DTO fields.
  • Update SlotReservationDto interface in signal-r.service.ts to match.
  • presenceUpdated store method: if incoming serverTimestamp < current[studyId].serverTimestamp, discard.
  • isReviewerSurplus computed: use totalAllocatedSessionCount >= sessionCountTarget && currentUserHasNoReservation.
  • Selector tests updated (both Case A and Case B now covered).

Phase 2 — Absolute timestamps + drift elimination

Server:

  • ReviewSessionConstants.SuspendedSessionGracePeriod becomes the single source of truth. Nothing else computes + 2h — the server always stamps ReleaseAt at the point a reservation is suspended and carries it forward.
  • Handshake payload on SignalR connect: include serverUtc so the client can compute offset.

Client:

  • Delete SUSPENDED_GRACE_PERIOD_MS from study-presence.store.ts.
  • suspendedExpiresAt computed: read mySession().releaseAt directly, subtract serverOffset.
  • connectionState / isReservationExpired: drive from releaseAt with the 10-second conservative buffer (Date.now() + serverOffset >= releaseAt - 10s).
  • disconnection-banner.component.ts countdown: render releaseAt - (Date.now() + serverOffset).
  • Store tests updated to inject releaseAt absolutes and drive via synthetic clock.

Phase 3 — REST response carries access result

Server:

  • Wire EnsureActiveReviewSessionForDirectNavigationAsync into the REST path that handles direct-URL navigation. Signature change: returns (Study, AccessResult).
  • AccessResult record: { granted: bool, denialReason: DenialReason?, snapshot: StudyReviewPresenceSnapshot }. DenialReason enum: AtCapacity, StageDisabled, ReservationExpired, NotAMember, Other.
  • Extend NextStudyResponseDto (or split into a parallel DTO for the direct-URL path — decide during implementation).
  • Controller tests cover each denial reason.

Client:

  • studyGuardCanActivate reads the accessResult from the response and seeds the presence store before the component renders.
  • The store's joinStudy method accepts an optional seedResult: AccessResult so the initial paint is already correct.
  • Guard tests updated.

Phase 4 — Hub method returns a result; client awaits

Server:

  • JoinStudyReview signature changes to Task<JoinStudyReviewResult>. Capacity-full path returns { granted: false, denialReason: AtCapacity, snapshot, serverTimestamp } instead of throwing.
  • Other failure modes (stage disabled, not a member) return structured results rather than throwing.
  • Hub tests updated to assert on returned values.

Client:

  • signalR.joinStudyReview awaits the invoke promise and feeds the result to the store.
  • rejoinFailed$ becomes rejoinResult$ carrying the same JoinStudyReviewResult. Rename for clarity; the existing fail-only semantics are a subset.
  • study-presence.init.ts subscribes and routes both success and denial into the store.

Phase 5 — Access-denied dialog

  • New component AccessDeniedDialogComponent with primary/secondary actions.
  • Primary action: dispatch the same action the Next button uses on stage-review. Locate and reuse, do not duplicate.
  • Secondary action: close dialog; leave the reviewer on the page with the existing surplus-lock overlay in place.
  • Copy lives in session-messages.ts under a new accessDenied block.
  • Shown from: (a) initial joinStudy when accessResult.granted === false, (b) rejoinResult$ when granted === false, © on the 409 save-guard path (see Phase 6).
  • Store tracks which studies have shown the dialog (dedupe per-visit, same pattern as notifiedStudyIds).
  • Component tests; E2E test for the denial-on-arrival flow.

Phase 6 — 409 submit handling

Server:

  • No change to the existing atomic save-guard (SaveWithCapacityGuardAsync is correct).
  • Response body on 409 includes enough context for the Sentry report (reservationState: string, tallySnapshot).

Client:

  • commit$ effect in review-effects.ts: replace the generic warnOnErrorWithRetryOption$ for the 409 case with (a) open the access-denied dialog using the saveRejected copy, (b) call captureException on Sentry with the 409 body and the store's current state.
  • Non-409 errors still use the retry toast.
  • Effect tests cover both branches.

Phase 7 — Scheduling integration tests

  • New test project SyRF.Scheduling.IntegrationTests (or add to an existing suitable project).
  • Fixture bootstraps a minimal MassTransit + in-process Quartz (RAMJobStore) host.
  • Tests enumerated in ADR-008 Decision 3.
  • Separate smoke test (2-3 cases) using Testcontainers + SQL Server Quartz store — tagged [Integration] and excluded from the fast suite.

Verification checklist

Before merge:

  • Direct URL to a capacity-full study with zero live reviewers renders the lock overlay + access-denied dialog on first paint.
  • Direct URL to a capacity-full study with live pre-save reviewers behaves identically.
  • Reviewer who disconnects while annotating and reconnects within the grace period keeps their reservation (existing behaviour unchanged).
  • Reviewer who disconnects and reconnects after 2+ hours with the study now full: sees the access-denied dialog on reconnect, form is locked, no save attempt possible.
  • Reviewer whose 409 save-guard triggers: dialog + Sentry event fired.
  • Countdown display in the disconnection banner ticks down to match the server's actual release time within 1 second at any point in the 2-hour window.
  • Scheduling integration tests pass; CI runtime impact acceptable.

Risks

  • DTO regeneration churn. Three endpoints carrying the same shape means schema changes ripple through generated code. Mitigation: consolidate into a single @shared/dtos/StudyReviewPresenceSnapshot definition and reference it from all three; avoid drift.
  • serverTimestamp discipline. If any snapshot is sent without a timestamp, last-write-wins degrades silently. Mitigation: make ServerTimestamp non-nullable on the DTO and set it in a single projection method.
  • Quartz RAMJobStore vs production SQL Server store drift. The two schedulers have slightly different misfire semantics. Mitigation: the 2-3 SQL-Server smoke tests catch differences; treat the fast suite as the primary coverage and the smoke suite as defence-in-depth.
  • E2E fragility. The reconnection/timing tests are inherently slower and more race-prone. Mitigation: use fake timers + test hooks to skip real waits where possible; keep real-time tests to the happy-path smoke tier.

Future work

Phase 7 (Quartz scheduling integration tests) was scoped out of this PR. Recommended setup for the follow-up:

  • New test project SyRF.Scheduling.IntegrationTests (or add to SyRF.ProjectManagement.Endpoint.Tests behind a [Trait("Category", "Integration")] marker).
  • Fixture bootstraps a minimal MassTransit + in-process Quartz (RAMJobStore) host using MassTransit.Testing.ITestHarness.
  • Coverage per ADR-008 Decision 3:
  • Schedule_FiresAfterDelay — 1-3 second delays, assert consumer runs within the window.
  • Cancel_BeforeFire_PreventsDelivery — schedule, cancel, verify consumer never runs.
  • Cancel_AfterFire_IsNoOp — schedule, let fire, then cancel — verify no exception.
  • Cancel_RacingDelivery_ConsumerIsIdempotent — exercise the consumer state guards (reservation is null, already idle, etc.).
  • Reschedule_OverwritesPrevious — verify join → leave → join doesn't leak triggers.
  • Optional secondary suite (2-3 tests) using Testcontainers + SQL Server Quartz store as a smoke check of the persistence path.

The deferred work is captured here because the pattern for testing Quartz scheduling in-process is non-trivial to establish and deserves its own focused PR with full coverage, rather than being crammed alongside the access-state UX changes.

References