Review Session Model Redesign

Executive Summary

The current ActiveReviewSession entity conflates two distinct concerns — capacity reservation and online presence tracking — into a single entity on the Study aggregate. This creates a confusing data model where:

  • The entity's name (ActiveReviewSession) does not reflect either of its responsibilities clearly.
  • Its lifecycle is inconsistent: it holds a capacity slot before the reviewer saves, then persists as an ambiguous presence marker afterwards, with tally correctness depending on implicit join-by-natural-key deduplication.
  • On return visits (where an AnnotationSession already exists), a new ActiveReviewSession is created but contributes nothing to capacity — it exists but is functionally inert.
  • Engagement history is lost: once a reviewer disconnects, all trace of their visit is gone (aside from the ExpiredReviewSession audit record, which is only created on timeout paths).

Objectives:

  1. Split ActiveReviewSession into two distinct entities with explicit responsibilities: SlotReservation (capacity) and ReviewerPresence (engagement).
  2. Graduate slot reservations to annotation sessions atomically via MongoDB multi-document transactions, carrying forward engagement timestamps.
  3. Store reviewer presence in its own collection (pmReviewerPresence) to preserve complete engagement history and support future admin dashboards.
  4. Align the model with QM v2's extracted AnnotationSession aggregate so the design survives migration.
  5. Eliminate tally deduplication logic — TotalAllocatedSessionCount = AnnotationSessions + SlotReservations, nothing more.

Why This Matters:

  • Clarity: Each entity has exactly one responsibility. Naming matches domain concept.
  • Correctness: Atomic graduation via transactions eliminates windows where a slot could be double-counted.
  • Admin observability: A planned admin dashboard ("which reviewers are reviewing what studies right now?") is impossible with the current model because presence is lost on disconnect. The new model makes this a first-class query.
  • Engagement analytics: Pre-save and post-save presence are qualitatively different engagement states; separating them enables meaningful metrics like "reservation-to-save conversion rate" and "time spent browsing vs. annotating".
  • QM v2 compatibility: The design works identically for migrated and unmigrated projects, with only the graduation target differing.

Problem Statement

Current model

Study (aggregate root)
├── ActiveReviewSessions[]
│   ├── InvestigatorId, StageId
│   ├── JoinedAtUtc
│   ├── StartedAnnotatingAtUtc?
│   ├── HasStartedAnnotating
│   ├── SuspendedSince?
│   ├── IdleSince?
│   ├── IdleScheduleTokenId?
│   └── Lives until: clean disconnect, idle timeout, or suspended grace period expiry
├── ExtractionInfo
│   ├── Sessions[] (AnnotationSession)
│   │   ├── CreatedAtUtc, CompletedAtUtc?
│   │   ├── Status, Reconciliation
│   │   └── Annotations, OutcomeData
│   └── SessionTallies (computed from Sessions + ActiveReviewSessions with dedup logic)

pmExpiredReviewSession collection
└── Audit records for sessions removed by timeout consumers

pmReviewSessionConnection collection
└── Per-SignalR-connection tracking for multi-tab support

Problems

1. Entity naming does not reflect responsibility. ActiveReviewSession suggests a relationship with AnnotationSession (both have "Session" in the name), but they are independent entities with no explicit reference to each other. Readers — including AI assistants — consistently misinterpret the relationship.

2. The entity has two distinct lifecycles glued together.

| Phase | Role |
| --- | --- |
| Before first save | Capacity reservation: holds one of N slots on the study |
| After first save | Ambiguous: still exists, but AnnotationSession is now the slot holder; ActiveReviewSession serves only as a presence marker |

The tally computation must join ActiveReviewSessions against Sessions by (InvestigatorId, StageId) to avoid double-counting the slot. This deduplication is correct but fragile: it depends on natural-key matching, and the correctness is not obvious from reading the domain model.
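The join described above can be pictured with a small in-memory TypeScript sketch. The names `SessionKey` and `legacyAllocatedCount` are hypothetical and exist only for illustration; the real computation lives on the Study aggregate and may differ in detail.

```typescript
// Hypothetical shape, reduced to the natural key used for the join.
interface SessionKey {
  investigatorId: string;
  stageId: string;
}

const keyOf = (s: SessionKey): string => `${s.investigatorId}|${s.stageId}`;

// Counts slots the way the current model has to: an ActiveReviewSession holds
// a slot only if no AnnotationSession exists for the same natural key.
function legacyAllocatedCount(
  annotationSessions: SessionKey[],
  activeReviewSessions: SessionKey[],
  stageId: string,
): number {
  const saved = annotationSessions.filter((s) => s.stageId === stageId);
  const savedKeys: Record<string, boolean> = {};
  saved.forEach((s) => {
    savedKeys[keyOf(s)] = true;
  });
  const unsavedActive = activeReviewSessions.filter(
    (a) => a.stageId === stageId && !savedKeys[keyOf(a)],
  );
  return saved.length + unsavedActive.length;
}

// Reviewer r1 has saved and is still connected; reviewer r2 has only joined.
const savedSessions: SessionKey[] = [{ investigatorId: "r1", stageId: "s1" }];
const connectedSessions: SessionKey[] = [
  { investigatorId: "r1", stageId: "s1" },
  { investigatorId: "r2", stageId: "s1" },
];
const allocated = legacyAllocatedCount(savedSessions, connectedSessions, "s1");
// r1 is counted once (through the AnnotationSession) and r2 once, so allocated is 2.
```

Dropping the `savedKeys` filter would count r1 twice and report 3, which is the double-counting the join exists to prevent.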

3. On return visits, the entity is inert. If a reviewer has an existing AnnotationSession and returns to the study, a new ActiveReviewSession is created but contributes nothing to the tally (it is deduplicated away). It exists only for operational state (idle/suspended timers). Its lifecycle is identical to its first-visit counterpart, but its purpose is different — it is no longer a reservation.

4. Engagement history is not preserved. ActiveReviewSession is removed on clean disconnect. Apart from the ExpiredReviewSession audit record (only created on timeout paths, not clean leaves), there is no record of a reviewer's engagement with a study after they disconnect. This makes the planned admin dashboard ("which project members are currently reviewing what studies, for how long?") difficult to implement — the data needed to answer historical variants of the question ("how much time did reviewer X spend on study Y last week?") is not captured.

5. Missing scheduling token for suspended state. The ActiveReviewSession stores IdleScheduleTokenId to allow cancellation of the scheduled MarkSessionIdle message, but there is no equivalent SuspendedScheduleTokenId for the RemoveSuspendedSession message. The consumer handles this via an idempotent IsSuspended guard, but the asymmetry is inconsistent.

Proposed Model

Entities

SlotReservation — on Study document

Operational, ephemeral entity that represents a capacity claim. Exists only while the reviewer is between joining and their first save.

SlotReservation (embedded on Study)
├── InvestigatorId, StageId     (natural key, unique per study)
├── ReservedAtUtc               (when reviewer first opened the study)
├── FormDirtiedAtUtc?           (when reviewer first interacted with form)
├── SuspendedSince?             (involuntary disconnect grace period)
├── IdleSince?                  (idle detection)
├── IdleScheduleTokenId?        (MassTransit token for MarkSessionIdle)
└── SuspendedScheduleTokenId?   (MassTransit token for RemoveSuspendedSession)

Lifecycle:

  • Created: on JoinStudyReview, only if no AnnotationSession exists for (InvestigatorId, StageId) on this study.
  • Not created on return visits: if the reviewer already has an AnnotationSession, the slot is held by that session and no reservation is needed.
  • Graduated: on first save, removed atomically within the same transaction that creates the AnnotationSession.
  • Removed on timeout: by MarkSessionIdle/RemoveIdleSession/RemoveSuspendedSession consumers (same as today, but operating on SlotReservation instead of ActiveReviewSession).

Operational responsibility: the reservation carries all the idle/suspend scheduling state because only pre-save presences need these controls. Post-save reviewers are not subject to idle timeout (they have committed work).
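The creation rule can be sketched in TypeScript as follows. `StudyState`, `naturalKey`, and this `joinStudyReview` function are hypothetical stand-ins for the Study aggregate and its domain method, and capacity checks (whether a free slot exists at all) are deliberately left out.

```typescript
interface SlotReservation {
  investigatorId: string;
  stageId: string;
  reservedAtUtc: Date;
}

// Hypothetical, minimal view of a Study: saved sessions are reduced to their
// natural keys ("investigatorId|stageId").
interface StudyState {
  annotationSessionKeys: string[];
  slotReservations: SlotReservation[];
}

const naturalKey = (investigatorId: string, stageId: string): string =>
  `${investigatorId}|${stageId}`;

// Returns the reservation holding the slot, or null on a return visit where
// the slot is already held by an AnnotationSession.
function joinStudyReview(
  study: StudyState,
  investigatorId: string,
  stageId: string,
  now: Date,
): SlotReservation | null {
  const k = naturalKey(investigatorId, stageId);
  if (study.annotationSessionKeys.indexOf(k) !== -1) {
    return null;
  }
  for (const existing of study.slotReservations) {
    if (naturalKey(existing.investigatorId, existing.stageId) === k) {
      return existing; // second tab or reconnect: the natural key stays unique
    }
  }
  const reservation: SlotReservation = { investigatorId, stageId, reservedAtUtc: now };
  study.slotReservations.push(reservation);
  return reservation;
}
```

Returning the existing reservation on a repeated join keeps ReservedAtUtc pinned to the moment the reviewer first opened the study.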

ReviewerPresence — in pmReviewerPresence collection

Engagement history entity. Captures every visit a reviewer makes to a study, preserved indefinitely for historical analysis.

ReviewerPresence (own collection)
├── Id                          (Guid)
├── StudyId, StageId, InvestigatorId, ProjectId
├── ConnectedAtUtc              (when this visit started — first SignalR connection)
├── EndedAtUtc?                 (null while current; set when last connection closes)
├── EndReason?                  (enum — see below)
├── FormDirtiedAtUtc?           (when reviewer first interacted with form during this visit)
├── AnnotationSessionId?        (set if a session was created or updated during this visit)
└── ReservationGraduatedAtUtc?  (set if this visit included the save that created the session)

EndReason enum:
├── Completed                   (clicked Save & Next, completed the review)
├── Skipped                     (clicked Skip)
├── NavigatedAway               (left the study page intentionally without completing)
├── IdleTimeout                 (idle threshold reached before any save)
├── SuspendedTimeout            (involuntary disconnect, grace period expired before reconnect)
└── SessionGraduated            (the pre-save presence ended because a save created the session;
                                 a new post-save presence was opened in its place)

Lifecycle:

  • Created: on JoinStudyReview when the first SignalR connection opens for this (InvestigatorId, StageId). Additional connections (tabs, devices) for the same reviewer on the same study do not create additional presence records; they attach to the existing one via ReviewSessionConnection.
  • Closed (not deleted): when the last connection for this presence closes, EndedAtUtc and EndReason are set. The record stays in the collection forever.
  • Split at graduation: when a pre-save presence reaches the save that creates the AnnotationSession, the current presence is closed (EndReason = SessionGraduated, AnnotationSessionId set), and a new presence is opened in its place. The new record is treated as a new visit: ConnectedAtUtc is set to that moment and AnnotationSessionId is populated from the start.

Multi-tab handling: one presence per (InvestigatorId, StageId, continuous engagement window). Multiple tabs on the same study under the same reviewer attach to the same presence record. ReviewSessionConnection remains the per-connection entity that gates when the presence opens (first connection) and closes (last connection).
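A minimal TypeScript sketch of this gating, assuming one tracker instance per (InvestigatorId, StageId) on a study. `PresenceTracker` is a hypothetical in-memory stand-in: its connection id list plays the role of ReviewSessionConnection records, and its `visits` array plays the role of the pmReviewerPresence collection.

```typescript
type EndReason =
  | "Completed"
  | "Skipped"
  | "NavigatedAway"
  | "IdleTimeout"
  | "SuspendedTimeout"
  | "SessionGraduated";

interface ReviewerPresence {
  connectedAtUtc: Date;
  endedAtUtc: Date | null;
  endReason: EndReason | null;
}

class PresenceTracker {
  private connectionIds: string[] = [];
  private current: ReviewerPresence | null = null;
  readonly visits: ReviewerPresence[] = [];

  connect(connectionId: string, now: Date): ReviewerPresence {
    if (this.connectionIds.indexOf(connectionId) === -1) {
      this.connectionIds.push(connectionId);
    }
    if (this.current === null) {
      // First connection opens a new presence record.
      this.current = { connectedAtUtc: now, endedAtUtc: null, endReason: null };
      this.visits.push(this.current);
    }
    return this.current;
  }

  disconnect(connectionId: string, now: Date, reason: EndReason): void {
    this.connectionIds = this.connectionIds.filter((id) => id !== connectionId);
    if (this.connectionIds.length > 0 || this.current === null) {
      return; // other tabs remain, or nothing is open
    }
    // Last connection closes the presence; the record is kept, not deleted.
    this.current.endedAtUtc = now;
    this.current.endReason = reason;
    this.current = null;
  }
}
```

Opening two tabs yields one presence record, closing one tab leaves it open, and closing the last tab sets EndedAtUtc. A later connection starts a second record in `visits`.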

Indexes:

  • (StudyId, StageId, EndedAtUtc): "who is on this study right now" (filter EndedAtUtc == null)
  • (InvestigatorId, EndedAtUtc): "what is reviewer X doing right now"
  • (ProjectId, EndedAtUtc): "who is currently reviewing anything in this project"
  • (StudyId, StageId, InvestigatorId): engagement history per reviewer per study
  • (AnnotationSessionId): reverse lookup from annotation session to presences

AnnotationSession — unchanged from current model, with two new timestamp fields

AnnotationSession
├── ... existing fields unchanged ...
├── ReservedAtUtc               (carried from SlotReservation on graduation)
└── FormDirtiedAtUtc?           (carried from SlotReservation on graduation)

After graduation, the AnnotationSession holds the full engagement timeline for the persisted review: ReservedAtUtc (joined), FormDirtiedAtUtc (started annotating), CreatedAtUtc (saved), CompletedAtUtc (marked complete).
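As an illustration of what the carried-forward timestamps make possible, the TypeScript sketch below derives per-phase durations from the four fields. `EngagementTimeline` and `engagementDurations` are hypothetical names, and the phase labels (browsing, annotating) are one possible reading of the timeline rather than agreed metric definitions.

```typescript
interface EngagementTimeline {
  reservedAtUtc: Date;
  formDirtiedAtUtc: Date | null;
  createdAtUtc: Date;
  completedAtUtc: Date | null;
}

interface EngagementDurations {
  browsingSeconds: number | null; // joined until first form interaction
  annotatingSeconds: number | null; // first form interaction until first save
  saveToCompleteSeconds: number | null; // first save until marked complete
}

const secondsBetween = (from: Date, to: Date): number =>
  (to.getTime() - from.getTime()) / 1000;

function engagementDurations(t: EngagementTimeline): EngagementDurations {
  return {
    browsingSeconds:
      t.formDirtiedAtUtc !== null ? secondsBetween(t.reservedAtUtc, t.formDirtiedAtUtc) : null,
    annotatingSeconds:
      t.formDirtiedAtUtc !== null ? secondsBetween(t.formDirtiedAtUtc, t.createdAtUtc) : null,
    saveToCompleteSeconds:
      t.completedAtUtc !== null ? secondsBetween(t.createdAtUtc, t.completedAtUtc) : null,
  };
}

const exampleDurations = engagementDurations({
  reservedAtUtc: new Date("2024-01-01T10:00:00Z"),
  formDirtiedAtUtc: new Date("2024-01-01T10:02:00Z"),
  createdAtUtc: new Date("2024-01-01T10:10:00Z"),
  completedAtUtc: null,
});
// browsingSeconds is 120, annotatingSeconds is 480, saveToCompleteSeconds is null.
```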

ReviewSessionConnection — unchanged

Stays as the per-SignalR-connection entity for multi-tab tracking. Its role is unchanged: it gates when SlotReservation / ReviewerPresence state transitions fire (first connection opens, last connection closes).

Tally computation

TotalAllocatedSessionCount = AnnotationSessions.Count(stage) + SlotReservations.Count(stage)
TotalEngagedSessionCount   = AnnotationSessions.Count(stage) + SlotReservations.Count(stage, formDirty)

No deduplication. A slot is held by exactly one entity at any time: either a SlotReservation (pre-save) or an AnnotationSession (post-save). The graduation transaction guarantees no overlap.
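The two formulas translate almost directly into code. The TypeScript sketch below uses hypothetical in-memory shapes for the two entities. `slotHeldTwice` is an illustrative check for the invariant the graduation transaction is meant to guarantee, not part of the proposed tally itself.

```typescript
interface AnnotationSession {
  investigatorId: string;
  stageId: string;
}

interface SlotReservation {
  investigatorId: string;
  stageId: string;
  formDirtiedAtUtc: Date | null;
}

function totalAllocatedSessionCount(
  sessions: AnnotationSession[],
  reservations: SlotReservation[],
  stageId: string,
): number {
  return (
    sessions.filter((s) => s.stageId === stageId).length +
    reservations.filter((r) => r.stageId === stageId).length
  );
}

function totalEngagedSessionCount(
  sessions: AnnotationSession[],
  reservations: SlotReservation[],
  stageId: string,
): number {
  return (
    sessions.filter((s) => s.stageId === stageId).length +
    reservations.filter((r) => r.stageId === stageId && r.formDirtiedAtUtc !== null).length
  );
}

// True if any (investigator, stage) is held by both a reservation and a
// session, which would mean the slot is counted twice.
function slotHeldTwice(sessions: AnnotationSession[], reservations: SlotReservation[]): boolean {
  return reservations.some((r) =>
    sessions.some((s) => s.investigatorId === r.investigatorId && s.stageId === r.stageId),
  );
}
```

With one saved session and two reservations on a stage, one of which has a dirty form, the allocated count is 3 and the engaged count is 2.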

Lifecycle

JOIN (first time, no prior AnnotationSession)
  Transaction:
    → SlotReservation created on study (ReservedAtUtc = now)
    → ReviewerPresence created (AnnotationSessionId = null)
  → ReviewSessionConnection created

JOIN (return, AnnotationSession exists)
  → No SlotReservation created (slot held by AnnotationSession)
  → ReviewerPresence created (AnnotationSessionId = existingSession.Id)
  → ReviewSessionConnection created

SECOND TAB OPENS (same reviewer, same study)
  → ReviewSessionConnection created for new tab
  → SlotReservation / ReviewerPresence: no change

FORM DIRTY (first time this visit)
  → SlotReservation.FormDirtiedAtUtc = now (if reservation exists)
  → ReviewerPresence.FormDirtiedAtUtc = now

SAVE (first time — graduation)
  Transaction:
    → SlotReservation removed from study
    → AnnotationSession created in ExtractionInfo.Sessions[] (unmigrated)
                                OR in pmAnnotationSession (migrated)
    → AnnotationSession.ReservedAtUtc = SlotReservation.ReservedAtUtc
    → AnnotationSession.FormDirtiedAtUtc = SlotReservation.FormDirtiedAtUtc
    → Current ReviewerPresence closed
        (EndedAtUtc = now, EndReason = SessionGraduated,
         AnnotationSessionId = newSession.Id,
         ReservationGraduatedAtUtc = now)
    → New ReviewerPresence created
        (ConnectedAtUtc = now, AnnotationSessionId = newSession.Id)

SAVE (subsequent)
  → AnnotationSession updated (no graduation — already happened)
  → ReviewerPresence: no change

TAB CLOSES (one of several)
  → ReviewSessionConnection removed
  → HasRemainingConnections = true → no further action

LAST TAB CLOSES (clean disconnect)
  → ReviewSessionConnection removed
  → HasRemainingConnections = false
  → ReviewerPresence closed
      (EndedAtUtc = now,
       EndReason = NavigatedAway | Completed | Skipped (determined by caller context))
  → If SlotReservation exists (never saved): SlotReservation stays; idle/suspend consumers eventually clean it up

LAST TAB CLOSES (involuntary disconnect)
  → ReviewSessionConnection removed
  → HasRemainingConnections = false
  → ReviewerPresence.SuspendedSince = now (not closed yet — grace period for reconnect)
  → If SlotReservation exists: SlotReservation.SuspendedSince = now, SuspendedScheduleTokenId set

RECONNECTION DURING GRACE PERIOD
  → ReviewSessionConnection created
  → ReviewerPresence.SuspendedSince = null (resumed)
  → SlotReservation.SuspendedSince = null (if exists)
  → Scheduled RemoveSuspendedSession cancelled via token

SUSPEND TIMEOUT (grace period expires without reconnect)
  → RemoveSuspendedSession consumer fires
  → ReviewerPresence closed (EndReason = SuspendedTimeout)
  → If SlotReservation exists: removed (slot freed)

IDLE TIMEOUT (reviewer connected but no interaction)
  → MarkSessionIdle sets SlotReservation.IdleSince
  → RemoveIdleSession consumer fires after stage-configurable timeout
  → ReviewerPresence closed (EndReason = IdleTimeout)
  → SlotReservation removed (slot freed)
  → ReviewSessionConnection cleanup by hub liveness check
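
The suspend, reconnect, and timeout transitions above can be sketched in TypeScript as follows. `FakeScheduler` is a hypothetical in-memory stand-in and is not MassTransit's API; it only models "scheduling returns a token, cancelling takes a token". The function names are likewise illustrative.

```typescript
interface SlotReservation {
  investigatorId: string;
  stageId: string;
  suspendedSince: Date | null;
  suspendedScheduleTokenId: string | null;
}

class FakeScheduler {
  private pending: string[] = [];
  private nextId = 1;

  schedule(): string {
    const token = `token-${this.nextId++}`;
    this.pending.push(token);
    return token;
  }

  cancel(token: string): void {
    this.pending = this.pending.filter((t) => t !== token);
  }

  isPending(token: string): boolean {
    return this.pending.indexOf(token) !== -1;
  }
}

function onInvoluntaryDisconnect(r: SlotReservation, s: FakeScheduler, now: Date): void {
  r.suspendedSince = now;
  r.suspendedScheduleTokenId = s.schedule(); // RemoveSuspendedSession
}

function onReconnect(r: SlotReservation, s: FakeScheduler): void {
  if (r.suspendedScheduleTokenId !== null) {
    s.cancel(r.suspendedScheduleTokenId);
  }
  r.suspendedSince = null;
  r.suspendedScheduleTokenId = null;
}

// Consumer side: removes the reservation only if it is still suspended, so a
// stale message delivered after a reconnect is a no-op.
function onRemoveSuspendedSession(
  reservations: SlotReservation[],
  investigatorId: string,
  stageId: string,
): SlotReservation[] {
  return reservations.filter(
    (r) =>
      !(r.investigatorId === investigatorId && r.stageId === stageId && r.suspendedSince !== null),
  );
}
```

Keeping the suspended-state guard in the consumer alongside token cancellation means a message that was already in flight when the reviewer reconnected still cannot free the slot.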

QM v2 Compatibility

The QM v2 refactor (PR #2461) extracts AnnotationSession from Study.ExtractionInfo.Sessions[] into a dedicated pmAnnotationSession collection for migrated projects. The review session model works identically under both data layouts — only the graduation target differs:

using var session = await client.StartSessionAsync();
session.StartTransaction();

// Read the reservation before removing it so its timestamps can be carried forward.
// (GetSlotReservation is a hypothetical accessor on Study.)
var reservation = study.GetSlotReservation(investigatorId, stageId);

var newSession = CreateAnnotationSession(/* ... */,
    reservedAtUtc: reservation.ReservedAtUtc,
    formDirtiedAtUtc: reservation.FormDirtiedAtUtc);

// Remove the SlotReservation from the Study document (always)
study.RemoveSlotReservation(investigatorId, stageId);

// Create the AnnotationSession in the appropriate location
if (project.MigrationStatus == Migrated)
    await _annotationSessionRepository.CreateAsync(newSession, session);
else
    study.ExtractionInfo.AddSession(newSession);

// Persist the study once, after all in-memory changes to it (the reservation
// removal and, for unmigrated projects, the embedded session)
await _pmUnitOfWork.SaveAsync(study, session);

// Close the pre-save ReviewerPresence and open the post-save one
currentPresence.Close(EndReason.SessionGraduated, newSession.Id);
var postSavePresence = new ReviewerPresence(/* ... */, annotationSessionId: newSession.Id);
await _reviewerPresenceRepository.SaveAsync(currentPresence, session);
await _reviewerPresenceRepository.CreateAsync(postSavePresence, session);

// If anything above throws, the transaction is never committed and none of the writes apply.
await session.CommitTransactionAsync();

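All writes (study, annotation session, presence close, presence create) commit atomically in a single MongoDB multi-document transaction. For unmigrated projects the annotation session is embedded in the study document, so it is part of the study write and there are three writes rather than four. Either every write is applied or none is.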

What stays identical across migration statuses

| Concern | Location | Migration-independent |
| --- | --- | --- |
| SlotReservation | Study document | Yes |
| Tally formula | Study document (computed) | Yes (source of session count differs but formula is the same) |
| ReviewerPresence | pmReviewerPresence collection | Yes |
| ReviewSessionConnection | pmReviewSessionConnection collection | Yes |
| Idle/suspend consumers | Operate on SlotReservation | Yes |

What differs per migration status

| Concern | Unmigrated | Migrated |
| --- | --- | --- |
| Graduation target | Study.ExtractionInfo.Sessions[] | pmAnnotationSession |
| Session count source for tally | Embedded | Materialized count from extracted collection |
| AnnotationSessionId on presence | Points to embedded session Id | Points to extracted session Id |

Admin Dashboard Enablement

The new model directly supports the planned admin dashboard for observing reviewer activity. Example queries:

Who is currently reviewing anything in this project?

db.pmReviewerPresence.find({
  projectId: <projectId>,
  endedAtUtc: null
})

What is reviewer X currently doing?

db.pmReviewerPresence.find({
  investigatorId: <investigatorId>,
  endedAtUtc: null
})

Historical engagement timeline for reviewer X on study Y:

db.pmReviewerPresence.find({
  studyId: <studyId>,
  investigatorId: <investigatorId>
}).sort({ connectedAtUtc: 1 })

Reservation-to-save conversion rate per stage (last 30 days):

db.pmReviewerPresence.aggregate([
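  // Pre-save presences only: never linked to a session, or closed by graduation
  // (graduation sets annotationSessionId on the closed pre-save record).
  { $match: { connectedAtUtc: { $gte: thirtyDaysAgo },
              $or: [{ annotationSessionId: null }, { endReason: "SessionGraduated" }] } },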
  { $group: {
      _id: "$stageId",
      total: { $sum: 1 },
      graduated: { $sum: { $cond: [{ $eq: ["$endReason", "SessionGraduated"] }, 1, 0] } }
  } },
  { $project: { rate: { $divide: ["$graduated", "$total"] } } }
])

These queries are simple indexed reads on a dedicated collection. None of them are feasible on the current model.
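To make the conversion-rate calculation concrete, here is an in-memory TypeScript equivalent. It assumes a presence counts as pre-save if it never received an AnnotationSessionId or if it ended with SessionGraduated, since graduation sets AnnotationSessionId on the record it closes. `PresenceRow` and `conversionRateByStage` are hypothetical names, and the 30-day date filter is omitted for brevity.

```typescript
interface PresenceRow {
  stageId: string;
  annotationSessionId: string | null;
  endReason: string | null;
}

// Post-save and return-visit presences carry an AnnotationSessionId from the
// start and never end with SessionGraduated, so they are excluded here.
const isPreSave = (p: PresenceRow): boolean =>
  p.annotationSessionId === null || p.endReason === "SessionGraduated";

function conversionRateByStage(rows: PresenceRow[]): Record<string, number> {
  const counts: Record<string, { total: number; graduated: number }> = {};
  rows.filter(isPreSave).forEach((p) => {
    const c = counts[p.stageId] || { total: 0, graduated: 0 };
    c.total += 1;
    if (p.endReason === "SessionGraduated") {
      c.graduated += 1;
    }
    counts[p.stageId] = c;
  });

  const rates: Record<string, number> = {};
  Object.keys(counts).forEach((stageId) => {
    const c = counts[stageId];
    if (c !== undefined) {
      rates[stageId] = c.graduated / c.total;
    }
  });
  return rates;
}
```

A stage with one graduated reservation and one that idled out reports a rate of 0.5, however many post-save presences exist for the same stage.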

Migration Strategy

This is a substantial refactor. It should not be merged into PR #2467. A phased migration is recommended:

Phase 1 — Introduce new entities alongside existing

  • Add SlotReservation entity on Study (empty list for existing studies).
  • Add pmReviewerPresence collection (empty).
  • Add new domain methods on Study (AddSlotReservation, RemoveSlotReservation, etc.) without touching existing ActiveReviewSession code.
  • Implement IReviewerPresenceRepository.
  • No behaviour change yet — just scaffolding.

Phase 2 — Dual-write during connection lifecycle

  • JoinStudyReview writes to both ActiveReviewSession (old) and SlotReservation + ReviewerPresence (new).
  • LeaveStudyReview / OnDisconnectedAsync mirror changes to both.
  • Tally computation continues to read from old entity.
  • Verify new entities match old state in staging.
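
One way to approach the staging verification is a per-study consistency check, sketched below in TypeScript. It assumes that, during dual-write, a SlotReservation should exist exactly for those ActiveReviewSessions that have no matching AnnotationSession. `StudySnapshot` and `findDualWriteMismatches` are hypothetical names. Divergences that the new lifecycle intends, such as a reservation that outlives a clean disconnect, would need to be allow-listed.

```typescript
interface NaturalKey {
  investigatorId: string;
  stageId: string;
}

interface StudySnapshot {
  activeReviewSessions: NaturalKey[];
  annotationSessions: NaturalKey[];
  slotReservations: NaturalKey[];
}

const toKey = (k: NaturalKey): string => `${k.investigatorId}|${k.stageId}`;

// Returns one message per discrepancy; an empty array means old and new state agree.
function findDualWriteMismatches(study: StudySnapshot): string[] {
  const saved = study.annotationSessions.map(toKey);
  const expected = study.activeReviewSessions
    .map(toKey)
    .filter((k) => saved.indexOf(k) === -1);
  const actual = study.slotReservations.map(toKey);

  const missing = expected
    .filter((k) => actual.indexOf(k) === -1)
    .map((k) => `missing SlotReservation for ${k}`);
  const unexpected = actual
    .filter((k) => expected.indexOf(k) === -1)
    .map((k) => `unexpected SlotReservation for ${k}`);
  return missing.concat(unexpected);
}
```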

Phase 3 — Switch reads to new entities

  • Tally computation reads from SlotReservation.
  • SignalR events broadcast from new entities.
  • Old ActiveReviewSession continues to be written but not read.

Phase 4 — Graduation transaction

  • SubmitSession graduates atomically via multi-document transaction.
  • Timestamps carried forward to AnnotationSession.
  • ReviewerPresence split at graduation.

Phase 5 — Remove old entities

  • Delete ActiveReviewSession, ExpiredReviewSession, and all associated code.
  • Migration script to drop Study.ActiveReviewSessions field from existing documents.
  • pmExpiredReviewSession collection archived.

Phase 6 — Admin dashboard

  • New feature built on top of the pmReviewerPresence collection.

Alternatives Considered

Alternative 1: Rename only

Rename ActiveReviewSession to something more accurate (e.g. ReviewPresence) without changing structure.

Rejected because: addresses the naming issue but not the dual-responsibility problem. Tally deduplication logic stays. Admin dashboard still infeasible. Low value for the confusion it would cause during the rename.

Alternative 2: Single unified ReviewSession entity with state machine

Replace both ActiveReviewSession and AnnotationSession with one entity holding the full lifecycle: Reserved → Annotating → Saved → Completed.

Rejected because: AnnotationSession is load-bearing across data export, reconciliation logic, screening statistics, and the QM v2 extraction. Unifying it with a presence concept would require deep changes to every consumer and is incompatible with the already-in-progress QM v2 refactor (which goes the opposite direction — more separation, not less).

Alternative 3: Keep presence on the study document instead of a dedicated collection

Store ReviewerPresences[] as an embedded array on the study, like the current ActiveReviewSessions.

Rejected because: even though the cardinality is bounded, it couples unrelated write patterns (presence events fire on every connect/disconnect, much more frequently than study mutations). It also constrains admin dashboard queries to aggregations across study documents instead of simple indexed reads on a dedicated collection. The separate collection is cleaner for the engagement-history use case.

Alternative 4: Nullable AnnotationSessionId link on the existing entity

Add a nullable AnnotationSessionId on the existing ActiveReviewSession entity, populated at first save.

Rejected because: the nullable field signals that the entity has outlived its reservation role — at which point it should be destroyed and a dedicated presence entity created. The transaction pattern makes this clean split feasible; keeping a nullable link field is a half-measure that retains the dual responsibility.

Risks and Mitigations

| Risk | Mitigation |
| --- | --- |
| MongoDB multi-document transaction overhead | Graduation is not a hot path (one save per reviewer per study); transaction cost is acceptable. |
| Transaction requires replica set | Production is Atlas (replica set). Local dev containers already run as single-node replica sets for change streams. No infrastructure change needed. |
| pmReviewerPresence collection growth | One record per visit per reviewer per study. Bounded by SessionCountTarget × visits-per-reviewer. TTL index or archival policy can be applied if long-term retention becomes an issue. |
| Migration complexity | Six-phase migration plan with dual-write validation in staging. Each phase independently reversible. |
| Existing tests must be rewritten | Scope is bounded (presence tests, tally tests, consumer tests). Mechanical rewrite, not logic changes. |
| Client-side presence state coupled to new SignalR event shape | Rename SignalR events (e.g. activeReviewSessionAdded → reviewerPresenceOpened) with server-side compatibility shim during transition. |

Out of Scope

  • Changes to AnnotationSession beyond adding two carried-forward timestamps.
  • Changes to the QM v2 migration plan (this feature assumes QM v2 lands first or in parallel).
  • Admin dashboard UI implementation (separate feature brief, dependent on this one).
  • Changes to how ReviewSessionConnection works for multi-tab tracking.

Open Questions

  1. Should EndReason distinguish "NavigatedAway" from "Completed" based on whether the reviewer clicked Save & Next vs. the browser close button? Requires a new SignalR event from the client to disambiguate, or a heuristic based on recent save activity.
  2. Should pmReviewerPresence have a TTL index for automatic purge after (say) 2 years, or keep records indefinitely?
  3. Does the admin dashboard need real-time push updates (SignalR broadcast of presence events) or is polling the collection acceptable?
  4. Should we expose an API endpoint for reviewers to see their own historical engagement with a study?

Success Criteria

  • All existing behaviour preserved: slot reservation, idle/suspend timeouts, reconnection, multi-tab support, SignalR presence broadcasts.
  • Tally computation is a simple sum (AnnotationSessions + SlotReservations) with no deduplication logic.
  • Graduation is atomic — no state where both SlotReservation and AnnotationSession exist for the same (investigator, stage) on the same study.
  • Engagement history is queryable after a reviewer disconnects.
  • Admin can query "who is currently reviewing study X" with a single indexed read.
  • Timestamps (ReservedAtUtc, FormDirtiedAtUtc) carried forward to AnnotationSession on graduation.
  • Works identically for migrated (QM v2) and unmigrated projects.