Review Session Model Redesign¶
Executive Summary¶
The current ActiveReviewSession entity conflates two distinct concerns — capacity reservation and online presence tracking — into a single entity on the Study aggregate. This creates a confusing data model where:
- The entity's name (ActiveReviewSession) does not reflect either of its responsibilities clearly.
- Its lifecycle is inconsistent: it holds a capacity slot before the reviewer saves, then persists as an ambiguous presence marker afterwards, with tally correctness depending on implicit join-by-natural-key deduplication.
- On return visits (where an AnnotationSession already exists), a new ActiveReviewSession is created but contributes nothing to capacity; it exists but is functionally inert.
- Engagement history is lost: once a reviewer disconnects, all trace of their visit is gone (aside from the ExpiredReviewSession audit record, which is only created on timeout paths).
Objectives:
- Split ActiveReviewSession into two distinct entities with explicit responsibilities: SlotReservation (capacity) and ReviewerPresence (engagement).
- Graduate slot reservations to annotation sessions atomically via MongoDB multi-document transactions, carrying forward engagement timestamps.
- Store reviewer presence in its own collection (pmReviewerPresence) to preserve complete engagement history and support future admin dashboards.
- Align the model with QM v2's extracted AnnotationSession aggregate so the design survives migration.
- Eliminate tally deduplication logic: TotalAllocatedSessionCount = AnnotationSessions + SlotReservations, nothing more.
Why This Matters:
- Clarity: Each entity has exactly one responsibility. Naming matches domain concept.
- Correctness: Atomic graduation via transactions eliminates windows where a slot could be double-counted.
- Admin observability: A planned admin dashboard ("which reviewers are reviewing what studies right now?") is impossible with the current model because presence is lost on disconnect. The new model makes this a first-class query.
- Engagement analytics: Pre-save and post-save presence are qualitatively different engagement states; separating them enables meaningful metrics like "reservation-to-save conversion rate" and "time spent browsing vs. annotating".
- QM v2 compatibility: The design works identically for migrated and unmigrated projects, with only the graduation target differing.
Problem Statement¶
Current model¶
Study (aggregate root)
├── ActiveReviewSessions[]
│ ├── InvestigatorId, StageId
│ ├── JoinedAtUtc
│ ├── StartedAnnotatingAtUtc?
│ ├── HasStartedAnnotating
│ ├── SuspendedSince?
│ ├── IdleSince?
│ ├── IdleScheduleTokenId?
│ └── Lives until: clean disconnect, idle timeout, or suspended grace period expiry
│
├── ExtractionInfo
│ ├── Sessions[] (AnnotationSession)
│ │ ├── CreatedAtUtc, CompletedAtUtc?
│ │ ├── Status, Reconciliation
│ │ └── Annotations, OutcomeData
│ └── SessionTallies (computed from Sessions + ActiveReviewSessions with dedup logic)
pmExpiredReviewSession collection
└── Audit records for sessions removed by timeout consumers
pmReviewSessionConnection collection
└── Per-SignalR-connection tracking for multi-tab support
Problems¶
1. Entity naming does not reflect responsibility.
ActiveReviewSession suggests a relationship with AnnotationSession (both have "Session" in the name), but they are independent entities with no explicit reference to each other. Readers — including AI assistants — consistently misinterpret the relationship.
2. The entity has two distinct lifecycles glued together.
| Phase | Role |
|---|---|
| Before first save | Capacity reservation — holds one of N slots on the study |
| After first save | Ambiguous — still exists but AnnotationSession is now the slot holder; ActiveReviewSession only serves as a presence marker |
The tally computation must join ActiveReviewSessions against Sessions by (InvestigatorId, StageId) to avoid double-counting the slot. This deduplication is correct but fragile: it depends on natural-key matching, and the correctness is not obvious from reading the domain model.
3. On return visits, the entity is inert.
If a reviewer has an existing AnnotationSession and returns to the study, a new ActiveReviewSession is created but contributes nothing to the tally (it is deduplicated away). It exists only for operational state (idle/suspended timers). Its lifecycle is identical to its first-visit counterpart, but its purpose is different — it is no longer a reservation.
4. Engagement history is not preserved.
ActiveReviewSession is removed on clean disconnect. Apart from the ExpiredReviewSession audit record (only created on timeout paths, not clean leaves), there is no record of a reviewer's engagement with a study after they disconnect. This makes the planned admin dashboard ("which project members are currently reviewing what studies, for how long?") difficult to implement — the data needed to answer historical variants of the question ("how much time did reviewer X spend on study Y last week?") is not captured.
5. Missing scheduling token for suspended state.
The ActiveReviewSession stores IdleScheduleTokenId to allow cancellation of the scheduled MarkSessionIdle message, but there is no equivalent SuspendedScheduleTokenId for the RemoveSuspendedSession message. The consumer compensates with an idempotent IsSuspended guard, but without a stored token the scheduled message cannot be cancelled, so the two timeout paths are handled inconsistently.
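To make the fragility described in Problem 2 concrete, here is a minimal JavaScript sketch of the kind of natural-key join the current tally needs. The function name, the flattened `study` shape, and the camelCase field names are illustrative assumptions, not the production C# code.

```javascript
// Hypothetical sketch of the CURRENT tally: an ActiveReviewSession only counts
// if no AnnotationSession shares its (investigatorId, stageId) natural key.
function currentAllocatedCount(study, stageId) {
  const sessions = study.sessions.filter((s) => s.stageId === stageId);
  const savedKeys = new Set(sessions.map((s) => `${s.investigatorId}|${s.stageId}`));

  // Dedup step: drop presence entries whose slot is already held by a session.
  const unsaved = study.activeReviewSessions.filter(
    (a) => a.stageId === stageId && !savedKeys.has(`${a.investigatorId}|${a.stageId}`)
  );
  return sessions.length + unsaved.length;
}

const study = {
  sessions: [{ investigatorId: "inv-1", stageId: "stage-1" }],
  activeReviewSessions: [
    { investigatorId: "inv-1", stageId: "stage-1" }, // return visit: deduplicated away
    { investigatorId: "inv-2", stageId: "stage-1" }, // pre-save: holds a slot
  ],
};
console.log(currentAllocatedCount(study, "stage-1")); // 2
```

Nothing in the data model says the first entry is inert; only the join knows, which is why the correctness is hard to see from the entities alone.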
Proposed Model¶
Entities¶
SlotReservation — on Study document¶
Operational, ephemeral entity that represents a capacity claim. Exists only while the reviewer is between joining and their first save.
SlotReservation (embedded on Study)
├── InvestigatorId, StageId (natural key, unique per study)
├── ReservedAtUtc (when reviewer first opened the study)
├── FormDirtiedAtUtc? (when reviewer first interacted with form)
├── SuspendedSince? (involuntary disconnect grace period)
├── IdleSince? (idle detection)
├── IdleScheduleTokenId? (MassTransit token for MarkSessionIdle)
└── SuspendedScheduleTokenId? (MassTransit token for RemoveSuspendedSession)
Lifecycle:
- Created: on JoinStudyReview, only if no AnnotationSession exists for (InvestigatorId, StageId) on this study.
- Not created on return visits: if the reviewer already has an AnnotationSession, the slot is held by that session and no reservation is needed.
- Graduated: on first save, removed atomically within the same transaction that creates the AnnotationSession.
- Removed on timeout: by MarkSessionIdle/RemoveIdleSession/RemoveSuspendedSession consumers (same as today, but operating on SlotReservation instead of ActiveReviewSession).
Operational responsibility: the reservation carries all the idle/suspend scheduling state because only pre-save presences need these controls. Post-save reviewers are not subject to idle timeout (they have committed work).
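The creation rule above can be sketched as follows. This is a JavaScript illustration under stated assumptions: the function name mirrors JoinStudyReview but the signature is invented, the `study` object is a flattened stand-in for the aggregate, and reusing an existing reservation on a pre-save rejoin is an assumption drawn from the "unique per study" natural key.

```javascript
// Hypothetical sketch: JoinStudyReview only reserves a slot on a first visit.
function joinStudyReview(study, investigatorId, stageId, nowUtc) {
  const sameKey = (x) => x.investigatorId === investigatorId && x.stageId === stageId;

  // Return visit: the AnnotationSession already holds the slot.
  if (study.sessions.some(sameKey)) return null;

  // Assumption: a rejoin before the first save reuses the existing reservation.
  const existing = study.slotReservations.find(sameKey);
  if (existing) return existing;

  const reservation = {
    investigatorId,
    stageId,
    reservedAtUtc: nowUtc,
    formDirtiedAtUtc: null,
    suspendedSince: null,
    idleSince: null,
    idleScheduleTokenId: null,
    suspendedScheduleTokenId: null,
  };
  study.slotReservations.push(reservation);
  return reservation;
}
```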
ReviewerPresence — in pmReviewerPresence collection¶
Engagement history entity. Captures every visit a reviewer makes to a study, preserved indefinitely for historical analysis.
ReviewerPresence (own collection)
├── Id (Guid)
├── StudyId, StageId, InvestigatorId, ProjectId
├── ConnectedAtUtc (when this visit started — first SignalR connection)
├── EndedAtUtc? (null while current; set when last connection closes)
├── EndReason? (enum — see below)
├── FormDirtiedAtUtc? (when reviewer first interacted with form during this visit)
├── SuspendedSince? (set on involuntary disconnect, cleared on reconnect; see Lifecycle)
├── AnnotationSessionId? (set if a session was created or updated during this visit)
└── ReservationGraduatedAtUtc? (set if this visit included the save that created the session)
EndReason enum:
├── Completed (clicked Save & Next, completed the review)
├── Skipped (clicked Skip)
├── NavigatedAway (left the study page intentionally without completing)
├── IdleTimeout (idle threshold reached before any save)
├── SuspendedTimeout (involuntary disconnect, grace period expired before reconnect)
└── SessionGraduated (the pre-save presence ended because a save created the session;
a new post-save presence was opened in its place)
Lifecycle:
- Created: on JoinStudyReview when the first SignalR connection opens for this (InvestigatorId, StageId). Additional connections (tabs, devices) for the same reviewer on the same study do not create additional presence records; they attach to the existing one via ReviewSessionConnection.
- Closed (not deleted): when the last connection for this presence closes, EndedAtUtc and EndReason are set. The record stays in the collection forever.
- Split at graduation: when a pre-save presence reaches the save that creates the AnnotationSession, the current presence is closed (EndReason = SessionGraduated, AnnotationSessionId set) and a new presence is opened in its place. The new presence counts as a fresh visit: its ConnectedAtUtc is the moment of graduation, and its AnnotationSessionId is populated from the start.
Multi-tab handling: one presence per (InvestigatorId, StageId, continuous engagement window). Multiple tabs on the same study under the same reviewer attach to the same presence record. ReviewSessionConnection remains the per-connection entity that gates when the presence opens (first connection) and closes (last connection).
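A minimal in-memory sketch of this gating, in JavaScript. The class and its method names are hypothetical; it models one study, with `connections` standing in for ReviewSessionConnection records and `history` for closed rows in pmReviewerPresence.

```javascript
// Hypothetical per-study tracker: connections gate when a presence opens and closes.
class PresenceTracker {
  constructor() {
    this.connections = new Map(); // connectionId -> presence key
    this.open = new Map();        // presence key -> open presence record
    this.history = [];            // closed records are kept, never deleted
  }

  connect(connectionId, investigatorId, stageId, nowUtc) {
    const key = `${investigatorId}|${stageId}`;
    this.connections.set(connectionId, key);
    if (!this.open.has(key)) {
      // First connection for this reviewer and stage: open a presence.
      this.open.set(key, {
        investigatorId, stageId, connectedAtUtc: nowUtc, endedAtUtc: null, endReason: null,
      });
    }
    return this.open.get(key); // extra tabs attach to the same record
  }

  disconnect(connectionId, endReason, nowUtc) {
    const key = this.connections.get(connectionId);
    if (key === undefined) return null;
    this.connections.delete(connectionId);

    const hasRemainingConnections = [...this.connections.values()].includes(key);
    if (hasRemainingConnections) return null; // other tabs still attached

    // Last connection closed: close the presence but keep it as history.
    const presence = this.open.get(key);
    presence.endedAtUtc = nowUtc;
    presence.endReason = endReason;
    this.open.delete(key);
    this.history.push(presence);
    return presence;
  }
}
```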
Indexes:
- (StudyId, StageId, EndedAtUtc) — "who is on this study right now" (filter EndedAtUtc == null)
- (InvestigatorId, EndedAtUtc) — "what is reviewer X doing right now"
- (ProjectId, EndedAtUtc) — "who is currently reviewing anything in this project"
- (StudyId, StageId, InvestigatorId) — engagement history per reviewer per study
- (AnnotationSessionId) — reverse lookup from annotation session to presences
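The index list above could be declared roughly as follows. This is a sketch: the camelCase field names follow the example queries in the Admin Dashboard Enablement section, `ensurePresenceIndexes` is a hypothetical helper, and index options (names, partial filters) are left to the real implementation.

```javascript
// Key specifications for the five indexes listed above.
const presenceIndexes = [
  { studyId: 1, stageId: 1, endedAtUtc: 1 },     // who is on this study right now
  { investigatorId: 1, endedAtUtc: 1 },          // what is reviewer X doing right now
  { projectId: 1, endedAtUtc: 1 },               // who is reviewing anything in this project
  { studyId: 1, stageId: 1, investigatorId: 1 }, // engagement history per reviewer per study
  { annotationSessionId: 1 },                    // reverse lookup from annotation session
];

// `collection` is anything exposing createIndex(keys), for example
// db.pmReviewerPresence in mongosh. With the Node.js driver each call returns
// a promise, so the caller would await Promise.all on the result.
function ensurePresenceIndexes(collection) {
  return presenceIndexes.map((keys) => collection.createIndex(keys));
}
```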
AnnotationSession — unchanged from current model, with two new timestamp fields¶
AnnotationSession
├── ... existing fields unchanged ...
├── ReservedAtUtc (carried from SlotReservation on graduation)
└── FormDirtiedAtUtc? (carried from SlotReservation on graduation)
After graduation, the AnnotationSession holds the full engagement timeline for the persisted review: ReservedAtUtc (joined), FormDirtiedAtUtc (started annotating), CreatedAtUtc (saved), CompletedAtUtc (marked complete).
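As an illustration of what the carried-forward timestamps make possible, the JavaScript sketch below derives phase durations from a session. The helper name and the three output fields are hypothetical, and timestamps are assumed to be ISO-8601 strings or Date values.

```javascript
// Hypothetical helper: phase durations (in milliseconds) from the four timestamps.
function engagementTimeline(session) {
  // Returns null when either end of the interval is missing.
  const ms = (from, to) => (from && to ? new Date(to) - new Date(from) : null);
  return {
    browsingMs: ms(session.reservedAtUtc, session.formDirtiedAtUtc),
    annotatingBeforeSaveMs: ms(session.formDirtiedAtUtc, session.createdAtUtc),
    saveToCompleteMs: ms(session.createdAtUtc, session.completedAtUtc),
  };
}
```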
ReviewSessionConnection — unchanged¶
Stays as the per-SignalR-connection entity for multi-tab tracking. Its role is unchanged: it gates when SlotReservation / ReviewerPresence state transitions fire (first connection opens, last connection closes).
Tally computation¶
TotalAllocatedSessionCount = AnnotationSessions.Count(stage) + SlotReservations.Count(stage)
TotalEngagedSessionCount = AnnotationSessions.Count(stage) + SlotReservations.Count(stage, formDirty)
No deduplication. A slot is held by exactly one entity at any time: either a SlotReservation (pre-save) or an AnnotationSession (post-save). The graduation transaction guarantees no overlap.
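The two formulas reduce to plain counts, as in this JavaScript sketch. The function name is hypothetical, and reading `formDirty` as "FormDirtiedAtUtc is set" is an assumption.

```javascript
// Sketch of the proposed tally: no join, no deduplication.
function sessionTallies(annotationSessions, slotReservations, stageId) {
  const sessions = annotationSessions.filter((s) => s.stageId === stageId);
  const reservations = slotReservations.filter((r) => r.stageId === stageId);
  return {
    totalAllocatedSessionCount: sessions.length + reservations.length,
    // Assumption: a reservation is "engaged" once its formDirtiedAtUtc is set.
    totalEngagedSessionCount:
      sessions.length + reservations.filter((r) => r.formDirtiedAtUtc != null).length,
  };
}
```

With one saved session and two reservations on a stage, one of them dirtied, this yields 3 allocated and 2 engaged.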
Lifecycle¶
JOIN (first time, no prior AnnotationSession)
Transaction:
→ SlotReservation created on study (ReservedAtUtc = now)
→ ReviewerPresence created (AnnotationSessionId = null)
→ ReviewSessionConnection created
JOIN (return, AnnotationSession exists)
→ No SlotReservation created (slot held by AnnotationSession)
→ ReviewerPresence created (AnnotationSessionId = existingSession.Id)
→ ReviewSessionConnection created
SECOND TAB OPENS (same reviewer, same study)
→ ReviewSessionConnection created for new tab
→ SlotReservation / ReviewerPresence: no change
FORM DIRTY (first time this visit)
→ SlotReservation.FormDirtiedAtUtc = now (if reservation exists)
→ ReviewerPresence.FormDirtiedAtUtc = now
SAVE (first time — graduation)
Transaction:
→ SlotReservation removed from study
→ AnnotationSession created in ExtractionInfo.Sessions[] (unmigrated)
OR in pmAnnotationSession (migrated)
→ AnnotationSession.ReservedAtUtc = SlotReservation.ReservedAtUtc
→ AnnotationSession.FormDirtiedAtUtc = SlotReservation.FormDirtiedAtUtc
→ Current ReviewerPresence closed
(EndedAtUtc = now, EndReason = SessionGraduated,
AnnotationSessionId = newSession.Id,
ReservationGraduatedAtUtc = now)
→ New ReviewerPresence created
(ConnectedAtUtc = now, AnnotationSessionId = newSession.Id)
SAVE (subsequent)
→ AnnotationSession updated (no graduation — already happened)
→ ReviewerPresence: no change
TAB CLOSES (one of several)
→ ReviewSessionConnection removed
→ HasRemainingConnections = true → no further action
LAST TAB CLOSES (clean disconnect)
→ ReviewSessionConnection removed
→ HasRemainingConnections = false
→ ReviewerPresence closed
(EndedAtUtc = now,
EndReason = NavigatedAway | Completed | Skipped (determined by caller context))
→ If SlotReservation exists (never saved): SlotReservation stays; idle/suspend consumers eventually clean it up
LAST TAB CLOSES (involuntary disconnect)
→ ReviewSessionConnection removed
→ HasRemainingConnections = false
→ ReviewerPresence.SuspendedSince = now (not closed yet — grace period for reconnect)
→ If SlotReservation exists: SlotReservation.SuspendedSince = now, SuspendedScheduleTokenId set
RECONNECTION DURING GRACE PERIOD
→ ReviewSessionConnection created
→ ReviewerPresence.SuspendedSince = null (resumed)
→ SlotReservation.SuspendedSince = null (if exists)
→ Scheduled RemoveSuspendedSession cancelled via token
SUSPEND TIMEOUT (grace period expires without reconnect)
→ RemoveSuspendedSession consumer fires
→ ReviewerPresence closed (EndReason = SuspendedTimeout)
→ If SlotReservation exists: removed (slot freed)
IDLE TIMEOUT (reviewer connected but no interaction)
→ MarkSessionIdle sets SlotReservation.IdleSince
→ RemoveIdleSession consumer fires after stage-configurable timeout
→ ReviewerPresence closed (EndReason = IdleTimeout)
→ SlotReservation removed (slot freed)
→ ReviewSessionConnection cleanup by hub liveness check
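The suspend, reconnect, and timeout steps for a SlotReservation can be sketched as below. This is JavaScript with hypothetical names throughout: `scheduler` is a stand-in whose `schedule()` returns a token and whose `cancel(token)` cancels the pending message, and it does not represent any real scheduling API.

```javascript
// Involuntary disconnect: start the grace period and remember the token.
function suspend(reservation, scheduler, nowUtc) {
  reservation.suspendedSince = nowUtc;
  reservation.suspendedScheduleTokenId = scheduler.schedule("RemoveSuspendedSession");
}

// Reconnection during the grace period: cancel the scheduled removal.
function reconnect(reservation, scheduler) {
  if (reservation.suspendedScheduleTokenId !== null) {
    scheduler.cancel(reservation.suspendedScheduleTokenId);
  }
  reservation.suspendedSince = null;
  reservation.suspendedScheduleTokenId = null;
}

// Consumer: still guarded, so a late or duplicate message is a no-op.
function removeSuspendedSession(study, investigatorId, stageId) {
  const index = study.slotReservations.findIndex(
    (r) => r.investigatorId === investigatorId && r.stageId === stageId
  );
  if (index === -1 || study.slotReservations[index].suspendedSince === null) return false;
  study.slotReservations.splice(index, 1); // slot freed
  return true;
}
```

Storing the token gives the suspended path the same cancel-on-recovery behaviour the idle path already has, while the guard keeps the consumer safe if cancellation loses a race.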
QM v2 Compatibility¶
The QM v2 refactor (PR #2461) extracts AnnotationSession from Study.ExtractionInfo.Sessions[] into a dedicated pmAnnotationSession collection for migrated projects. The review session model works identically under both data layouts — only the graduation target differs:
using var session = await client.StartSessionAsync();
session.StartTransaction();

// Remove the SlotReservation from the Study document (always).
// `reservation` is the SlotReservation read from the study before this removal.
study.RemoveSlotReservation(investigatorId, stageId);

// Create the AnnotationSession in the appropriate location
var newSession = CreateAnnotationSession(/* ... */,
    reservedAtUtc: reservation.ReservedAtUtc,
    formDirtiedAtUtc: reservation.FormDirtiedAtUtc);
if (project.MigrationStatus == Migrated)
    await _annotationSessionRepository.CreateAsync(newSession, session);
else
    study.ExtractionInfo.AddSession(newSession);

// Save the study only after both mutations, so that for unmigrated projects
// the embedded session is persisted together with the reservation removal
await _pmUnitOfWork.SaveAsync(study, session);

// Close the pre-save ReviewerPresence and open the post-save one
currentPresence.Close(EndReason.SessionGraduated, newSession.Id);
var postSavePresence = new ReviewerPresence(/* ... */, annotationSessionId: newSession.Id);
await _reviewerPresenceRepository.SaveAsync(currentPresence, session);
await _reviewerPresenceRepository.CreateAsync(postSavePresence, session);

await session.CommitTransactionAsync();
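The writes (study, annotation session, presence close, presence create) commit atomically in a single MongoDB multi-document transaction. For unmigrated projects the annotation session is embedded in the study, so it is part of the study write and there are three writes rather than four. No partial state is possible.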
What stays identical across migration statuses¶
| Concern | Location | Migration-independent |
|---|---|---|
| SlotReservation | Study document | Yes |
| Tally formula | Study document (computed) | Yes (source of session count differs but formula is the same) |
| ReviewerPresence | pmReviewerPresence collection | Yes |
| ReviewSessionConnection | pmReviewSessionConnection collection | Yes |
| Idle/suspend consumers | Operate on SlotReservation | Yes |
What differs per migration status¶
| Concern | Unmigrated | Migrated |
|---|---|---|
| Graduation target | Study.ExtractionInfo.Sessions[] | pmAnnotationSession |
| Session count source for tally | Embedded | Materialized count from extracted collection |
| AnnotationSessionId on presence | Points to embedded session Id | Points to extracted session Id |
Admin Dashboard Enablement¶
The new model directly supports the planned admin dashboard for observing reviewer activity. Example queries:
Who is currently reviewing anything in this project?
What is reviewer X currently doing?
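The two current-state questions above map onto the (ProjectId, EndedAtUtc) and (InvestigatorId, EndedAtUtc) indexes. A sketch, assuming camelCase field names as in the other queries in this section; the wrapper functions are hypothetical, and `db` is the mongosh database handle or any object with the same shape.

```javascript
// Who is currently reviewing anything in this project?
function currentReviewersInProject(db, projectId) {
  // endedAtUtc: null matches presences that are still open
  return db.pmReviewerPresence.find({ projectId: projectId, endedAtUtc: null });
}

// What is reviewer X currently doing?
function currentActivityOfReviewer(db, investigatorId) {
  return db.pmReviewerPresence.find({ investigatorId: investigatorId, endedAtUtc: null });
}
```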
Historical engagement timeline for reviewer X on study Y:
db.pmReviewerPresence.find({
  studyId: <studyId>,
  investigatorId: <investigatorId>
}).sort({ connectedAtUtc: 1 })
Reservation-to-save conversion rate per stage (last 30 days):
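const thirtyDaysAgo = new Date(Date.now() - 30 * 24 * 60 * 60 * 1000);
db.pmReviewerPresence.aggregate([
  // Pre-save presences only: either no session is linked yet, or the presence was
  // closed by graduation (closing sets annotationSessionId, so match on endReason too)
  { $match: {
      connectedAtUtc: { $gte: thirtyDaysAgo },
      $or: [{ annotationSessionId: null }, { endReason: "SessionGraduated" }]
  } },
  { $group: {
      _id: "$stageId",
      total: { $sum: 1 },
      graduated: { $sum: { $cond: [{ $eq: ["$endReason", "SessionGraduated"] }, 1, 0] } }
  } },
  { $project: { rate: { $divide: ["$graduated", "$total"] } } }
])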
These queries are simple indexed reads on a dedicated collection. The historical ones cannot be answered from the current model, because presence data is discarded when the reviewer disconnects.
Migration Strategy¶
This is a substantial refactor. It should not be merged into PR #2467. A phased migration is recommended:
Phase 1 — Introduce new entities alongside existing¶
- Add SlotReservation entity on Study (empty list for existing studies).
- Add pmReviewerPresence collection (empty).
- Add new domain methods on Study (AddSlotReservation, RemoveSlotReservation, etc.) without touching existing ActiveReviewSession code.
- Implement IReviewerPresenceRepository.
- No behaviour change yet; this phase is scaffolding only.
Phase 2 — Dual-write during connection lifecycle¶
- JoinStudyReview writes to both ActiveReviewSession (old) and SlotReservation + ReviewerPresence (new).
- LeaveStudyReview / OnDisconnectedAsync mirror changes to both.
- Tally computation continues to read from old entity.
- Verify new entities match old state in staging.
Phase 3 — Switch reads to new entities¶
- Tally computation reads from SlotReservation.
- SignalR events broadcast from new entities.
- Old ActiveReviewSession continues to be written but not read.
Phase 4 — Graduation transaction¶
- SubmitSession graduates atomically via multi-document transaction.
- Timestamps carried forward to AnnotationSession.
- ReviewerPresence split at graduation.
Phase 5 — Remove old entities¶
- Delete ActiveReviewSession, ExpiredReviewSession, and all associated code.
- Migration script to drop Study.ActiveReviewSessions field from existing documents.
- pmExpiredReviewSession collection archived.
Phase 6 — Admin dashboard¶
- New feature built on top of the pmReviewerPresence collection.
Alternatives Considered¶
Alternative 1: Rename only¶
Rename ActiveReviewSession to something more accurate (e.g. ReviewPresence) without changing structure.
Rejected because: addresses the naming issue but not the dual-responsibility problem. Tally deduplication logic stays. Admin dashboard still infeasible. Low value for the confusion it would cause during the rename.
Alternative 2: Single unified ReviewSession entity with state machine¶
Replace both ActiveReviewSession and AnnotationSession with one entity holding the full lifecycle: Reserved → Annotating → Saved → Completed.
Rejected because: AnnotationSession is load-bearing across data export, reconciliation logic, screening statistics, and the QM v2 extraction. Unifying it with a presence concept would require deep changes to every consumer and is incompatible with the already-in-progress QM v2 refactor (which goes the opposite direction — more separation, not less).
Alternative 3: Keep presence on the study document instead of a dedicated collection¶
Store ReviewerPresences[] as an embedded array on the study, like the current ActiveReviewSessions.
Rejected because: even though the cardinality is bounded, it couples unrelated write patterns (presence events fire on every connect/disconnect, much more frequently than study mutations). It also constrains admin dashboard queries to aggregations across study documents instead of simple indexed reads on a dedicated collection. The separate collection is cleaner for the engagement-history use case.
Alternative 4: Keep ActiveReviewSession and add AnnotationSessionId? as a nullable link¶
Add a nullable AnnotationSessionId on the existing entity, populated at first save.
Rejected because: the nullable field signals that the entity has outlived its reservation role — at which point it should be destroyed and a dedicated presence entity created. The transaction pattern makes this clean split feasible; keeping a nullable link field is a half-measure that retains the dual responsibility.
Risks and Mitigations¶
| Risk | Mitigation |
|---|---|
| MongoDB multi-document transaction overhead | Graduation is not a hot path (one save per reviewer per study); transaction cost is acceptable. |
| Transaction requires replica set | Production is Atlas (replica set). Local dev containers already run as single-node replica sets for change streams. No infrastructure change needed. |
| pmReviewerPresence collection growth | One record per visit per reviewer per study. Bounded by SessionCountTarget × visits-per-reviewer. TTL index or archival policy can be applied if long-term retention becomes an issue. |
| Migration complexity | Six-phase migration plan with dual-write validation in staging. Each phase independently reversible. |
| Existing tests must be rewritten | Scope is bounded (presence tests, tally tests, consumer tests). Mechanical rewrite, not logic changes. |
| Client-side presence state coupled to new SignalR event shape | Rename SignalR events (e.g. activeReviewSessionAdded → reviewerPresenceOpened) with server-side compatibility shim during transition. |
Out of Scope¶
- Changes to AnnotationSession beyond adding two carried-forward timestamps.
- Changes to the QM v2 migration plan (this feature assumes QM v2 lands first or in parallel).
- Admin dashboard UI implementation (separate feature brief, dependent on this one).
- Changes to how ReviewSessionConnection works for multi-tab tracking.
Open Questions¶
- Should EndReason distinguish "NavigatedAway" from "Completed" based on whether the reviewer clicked Save & Next vs. the browser close button? Requires a new SignalR event from the client to disambiguate, or a heuristic based on recent save activity.
- Should pmReviewerPresence have a TTL index for automatic purge after (say) 2 years, or keep records indefinitely?
- Does the admin dashboard need real-time push updates (SignalR broadcast of presence events) or is polling the collection acceptable?
- Should we expose an API endpoint for reviewers to see their own historical engagement with a study?
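If the TTL option in the questions above is chosen, the index might look like the sketch below. Assumptions: the field is stored as a BSON date under the camelCase name `endedAtUtc`, two years is approximated as 730 days, and `ensurePresenceTtlIndex` is a hypothetical helper. MongoDB TTL indexes must be single-field, and documents whose indexed field is missing or not a date are never expired, so open presences are unaffected.

```javascript
const SECONDS_PER_DAY = 24 * 60 * 60;

// Expire closed presence records roughly two years after they ended.
const presenceTtlIndex = {
  keys: { endedAtUtc: 1 },
  options: { expireAfterSeconds: 2 * 365 * SECONDS_PER_DAY },
};

// `collection` is anything exposing createIndex(keys, options),
// for example db.pmReviewerPresence in mongosh.
function ensurePresenceTtlIndex(collection) {
  return collection.createIndex(presenceTtlIndex.keys, presenceTtlIndex.options);
}
```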
Success Criteria¶
- All existing behaviour preserved: slot reservation, idle/suspend timeouts, reconnection, multi-tab support, SignalR presence broadcasts.
- Tally computation is a simple sum (AnnotationSessions + SlotReservations) with no deduplication logic.
- Graduation is atomic: no state where both SlotReservation and AnnotationSession exist for the same (investigator, stage) on the same study.
- Engagement history is queryable after a reviewer disconnects.
- Admin can query "who is currently reviewing study X" with a single indexed read.
- Timestamps (ReservedAtUtc, FormDirtiedAtUtc) carried forward to AnnotationSession on graduation.
- Works identically for migrated (QM v2) and unmigrated projects.
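The atomicity criterion can be checked mechanically, for example in an integration test or a staging audit. A JavaScript sketch with a hypothetical function name and flattened inputs for a single study:

```javascript
// Returns the (investigatorId|stageId) keys that violate the invariant, i.e. slots
// held by both a SlotReservation and an AnnotationSession. Expected result: [].
function findDoubleHeldSlots(annotationSessions, slotReservations) {
  const key = (x) => `${x.investigatorId}|${x.stageId}`;
  const heldBySession = new Set(annotationSessions.map(key));
  return slotReservations.filter((r) => heldBySession.has(key(r))).map(key);
}
```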