Program Overview
Scope covers facility readiness, compute/storage build, policy-aligned access, and AWS data exchange so CBSD can run combined clinical + discovery sequencing with predictable turnaround.
Mission Summary
- Converge legacy CBSD delivery items with refreshed rack, power, and network layouts.
- Guarantee dual-room resiliency: Sequencer Room (instruments + UPS) and Datacenter Room (compute, storage, data protection).
- Stand up automated transfers to Crown AWS tenant for analytics and partner sharing.
Key Highlights
- VLAN4 10 Gb production, VLAN7 40 Gb data, VLAN2 IPMI/OOB, VLAN10 work area.
- Three 500 TB storage tiers (primary, secondary, isolated clinical) bridged to AWS by the data-protection layer.
- Crown-standard TACACS+, syslog, RBAC enforcement on the management fabric.
Global Architecture
The layout joins user entry, instrument controls, compute/storage, and cloud egress into a deterministic flow. The SVG diagram highlights rooms, critical servers, storage tiers, users, and AWS sharing.
Architecture Narrative: instruments land raw BCL/FASTQ on VLAN4; management policies validate transfers and promote data into the 40 Gb landing tiers; the data-protection layer bridges to AWS S3/Glacier; and analysts or automation desks consume through VLAN10, TACACS+-controlled services, and VPN/Direct Connect.
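A minimal sketch of that validate-and-promote step, assuming a manifest-per-run layout; the mount points (INSTRUMENT_DROP, LANDING_TIER) and the checksums.sha256 manifest name are hypothetical stand-ins for the deployed paths.

```python
import hashlib
import shutil
from pathlib import Path

# Hypothetical mount points; the real layout is set at deployment.
INSTRUMENT_DROP = Path("/mnt/vlan4/instrument_drop")
LANDING_TIER = Path("/gpfs/landing/primary")

def sha256sum(path: Path, chunk: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so multi-GB BCL/FASTQ payloads fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        while block := fh.read(chunk):
            digest.update(block)
    return digest.hexdigest()

def promote_run(run_dir: Path) -> None:
    """Verify every payload file against its manifest, then move the run to the landing tier."""
    manifest = run_dir / "checksums.sha256"  # assumed to be written instrument-side
    for line in manifest.read_text().splitlines():
        expected, name = line.split(maxsplit=1)
        actual = sha256sum(run_dir / name)
        if actual != expected:
            raise ValueError(f"checksum mismatch for {name}: {actual} != {expected}")
    shutil.move(str(run_dir), str(LANDING_TIER / run_dir.name))

for run in sorted(INSTRUMENT_DROP.glob("RUN_*")):
    promote_run(run)
```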
Business & Data Flow
Work Area clients push manifests and reagent QC to sequencers; the management plane enforces IAM + ACL before every run window.
Sequencers stream over dual 10 Gb links while 1 Gb control interfaces post telemetry, health, and job metadata.
500 TB tiers absorb payloads, execute checksum validation, snapshot to the data-protection platform, and flag clinical vs discovery buckets.
Five-node compute cluster executes WGS/WTS, variant calling, and reporting using Slurm + containerized workflows.
Automation plane packages signed outputs, syncs to AWS (DX/VPN), and notifies downstream LIMS + partners.
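A sketch of that publish step, assuming boto3 with credentials already provisioned; the bucket name, SNS topic ARN, and KMS setting are placeholders rather than production values.

```python
import boto3
from pathlib import Path

s3 = boto3.client("s3")
sns = boto3.client("sns")

BUCKET = "cbsd-signed-outputs"                                  # placeholder bucket
TOPIC_ARN = "arn:aws:sns:us-east-1:000000000000:cbsd-run-events"  # placeholder topic

def publish_run(package_dir: Path, run_id: str) -> None:
    """Upload signed outputs over DX/VPN, then notify LIMS + partner subscribers."""
    for artifact in package_dir.rglob("*"):
        if artifact.is_file():
            key = f"{run_id}/{artifact.relative_to(package_dir)}"
            s3.upload_file(
                str(artifact), BUCKET, key,
                ExtraArgs={"ServerSideEncryption": "aws:kms"},  # encrypted exchange per policy
            )
    sns.publish(
        TopicArn=TOPIC_ARN,
        Subject=f"{run_id} published",
        Message=f"Signed outputs for {run_id} available at s3://{BUCKET}/{run_id}/",
    )
```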
Design Principles & Delivery Milestones
Design Tenets
- Segregate production, control, data, and work networks with explicit ACLs.
- Prefer dual-power, dual-path connectivity for every rack component.
- Automate validation: checksum, logging, and ticketing per run (see the syslog sketch after this list).
- Cloud exchanges gated by data-protection policy + encryption.
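The logging half of that validation tenet needs only the standard library. A minimal sketch, assuming a reachable syslog collector at a placeholder address; the ticketing hook is left abstract because the ticketing API is site-specific.

```python
import logging
import logging.handlers

# Placeholder collector address; the real target is the central syslog on the security core.
handler = logging.handlers.SysLogHandler(address=("syslog.example.internal", 514))
handler.setFormatter(logging.Formatter("cbsd-run-validation: %(levelname)s %(message)s"))

log = logging.getLogger("run_validation")
log.addHandler(handler)
log.setLevel(logging.INFO)

def record_run(run_id: str, checksum_ok: bool) -> None:
    """Log the per-run validation result; a failure would also raise a ticket (stubbed here)."""
    if checksum_ok:
        log.info("run %s validated", run_id)
    else:
        log.error("run %s failed checksum validation", run_id)
        # open_ticket(run_id)  # site-specific ticketing hook, intentionally left abstract
```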
Success Metrics
- < 6 hours ingest-to-analysis SLA for 2 TB runs.
- RPO < 1 hr via data-protection snapshots + S3 replication (see the watchdog sketch after this list).
- Instrument utilization ≥ 80% with no backlog.
- Zero critical audit findings for VLAN segmentation.
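One way the RPO target could be watched, assuming snapshot creation times are visible as directory-entry mtimes; the snapshot path and alert action are illustrative only.

```python
import time
from pathlib import Path

RPO_SECONDS = 3600                       # < 1 hr target from the metrics above
SNAPSHOT_DIR = Path("/gpfs/.snapshots")  # hypothetical snapshot listing

def rpo_breached() -> bool:
    """True if the newest snapshot is older than the RPO budget."""
    newest = max(p.stat().st_mtime for p in SNAPSHOT_DIR.iterdir())
    return (time.time() - newest) > RPO_SECONDS

if rpo_breached():
    print("ALERT: data-protection snapshot age exceeds 1 h RPO")
```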
Delivery Milestones
1 · Facility: Power, cooling, VLAN4/7 provisioning, UPS install complete.
2 · Storage: Primary/secondary tiers mirrored, clinical tier fenced, backup policy validated.
3 · Validation: Run synthetic datasets, confirm IAM, syslog, and automation hooks.
4 · Cutover: Sequencers switch to new fabric, AWS sharing enabled, operations handoff.
Current CBSD Baseline
Facility
- Dedicated cold aisle ready; 38,000 BTU/h HVAC confirmed.
- Two 42U racks reserved; PDUs awaiting final circuit test.
Network
- Legacy core switch ports exhausted; temporary ToR installed.
- VLAN2/4/7 definitions approved; ACLs in staging.
Storage & Backup
- 500 TB landing shelves delivered; awaiting rack-in.
- Data-protection cluster licensed; policies drafted.
Process
- Runbooks updated to include AWS data escrow.
- LIMS integration in progress; IAM roles mapped.
Execution Checklist by Zone
1 · Lab & Sequencer Room
- Install dual 10 Gb ToR with VLAN4 trunks + VLAN2 OOB.
- Provide UPS-backed receptacles (2 × 30 A) for each sequencer.
- Patch fiber pairs to datacenter core, label per Crown standard.
- Calibrate humidity + particle sensors feeding facility BMS.
2 · Datacenter & Racks
- Rack two management controllers, five compute nodes, three storage shelves.
- Deploy dual PDUs per rack; map to UPS + utility.
- Terminate protection + AWS gateways in upper U for airflow.
- Connect TACACS+/syslog uplinks to security core.
3 · Physical Cabling
- Fiber trunk: Sequencer room → Datacenter (OM4, 12-core).
- Cat6A patch for VLAN10 work area + console runs.
- Label 40 Gb QSFP+ runs for storage mesh, include spare.
- Document cross-connect schedule in NetBox.
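Cross-connect documentation can be driven from the NetBox API. A read-only sketch using the pynetbox client; the URL, token, and device name are placeholders.

```python
import pynetbox

# Placeholder endpoint and token; real values come from the Crown NetBox instance.
nb = pynetbox.api("https://netbox.example.internal", token="REDACTED")

# Print a simple cross-connect schedule for one ToR switch (hypothetical device name).
for iface in nb.dcim.interfaces.filter(device="seqroom-tor-1"):
    print(f"{iface.name:12} {iface.description or '(unlabeled)'}")
```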
4 · Compute, Storage, Cloud
- Install RHEL 9 + Slurm on cluster; configure container runtime (see the submission sketch after this list).
- Carve GPFS pools (500 TB primary/secondary/clinical) with QoS tiers.
- Configure the data-protection platform to replicate to AWS S3 + Glacier; test Snowball fallback.
- Publish user access via Okta + Crown RBAC; map VPN/DX routes.
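A sketch of submitting one containerized pipeline step through Slurm, assuming Apptainer and a staged pipeline image; the partition name, image path, and entrypoint script are placeholders.

```python
import subprocess
import tempfile

SCRIPT = """#!/bin/bash
#SBATCH --job-name=wgs-demo
#SBATCH --partition=compute       # placeholder partition name
#SBATCH --cpus-per-task=16
#SBATCH --mem=64G
#SBATCH --time=06:00:00
apptainer exec /gpfs/images/pipeline.sif run_wgs.sh "$RUN_DIR"  # placeholder image + entrypoint
"""

def submit(run_dir: str) -> str:
    """Write a batch script, submit it via sbatch, and return the confirmation line."""
    with tempfile.NamedTemporaryFile("w", suffix=".sbatch", delete=False) as fh:
        fh.write(SCRIPT)
        path = fh.name
    out = subprocess.run(
        ["sbatch", f"--export=RUN_DIR={run_dir}", path],
        check=True, capture_output=True, text=True,
    )
    return out.stdout.strip()  # e.g. "Submitted batch job 12345"

print(submit("/gpfs/landing/primary/RUN_0001"))
```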
5 · User & Remote Access
- Provision analyst pods with VLAN10 + secure jump hosts.
- Enable remote monitoring (Grafana, Splunk) with RBAC roles (see the exporter sketch after this list).
- Implement service catalog for job submission + data pulls.
- Train support teams on escalation + DR exercises.
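Remote monitoring can surface the instrument-utilization SLA directly. A minimal Prometheus exporter sketch using the prometheus_client library, with the telemetry read stubbed because the real feed arrives over the 1 Gb control links.

```python
import random
import time
from prometheus_client import Gauge, start_http_server

utilization = Gauge("sequencer_utilization_ratio",
                    "Fraction of scheduled instrument time in use")

def read_utilization() -> float:
    """Stub: the real value would come from instrument telemetry on the control interfaces."""
    return random.uniform(0.7, 0.95)

start_http_server(9100)  # Prometheus scrapes this endpoint; Grafana renders it
while True:
    utilization.set(read_utilization())
    time.sleep(60)
```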
Network & VLAN Plan
| VLAN | Name / Purpose | IP Space | Bandwidth | Notes |
|---|---|---|---|---|
| 2 | IPMI / OOB | 172.23.64.0/24 | 1 Gb | Management BMCs, UPS, PDUs, storage controllers, TACACS+. |
| 4 | 10 Gb Production | 172.23.63.0/24 | 10 Gb | Sequencers, interface servers, compute entry, AWS gateway. |
| 5 | Core Services | 172.23.62.0/24 | 10 Gb | Management appliances, jump hosts, automation. |
| 7 | 40 Gb Storage | 10.0.6.0/24 | 40 Gb | GPFS fabric for three 500 TB tiers, data-protection ingest. |
| 10 | Work Area | 172.23.70.0/24 | 1 Gb | Analyst workstations, remote admin, LIMS terminals. |
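The plan above doubles as a machine-checkable address policy. A sketch using the standard ipaddress module with the subnets exactly as tabled; the two host assignments are made-up examples.

```python
import ipaddress

# Subnets copied from the VLAN plan above.
VLANS = {
    2: ipaddress.ip_network("172.23.64.0/24"),   # IPMI / OOB
    4: ipaddress.ip_network("172.23.63.0/24"),   # 10 Gb production
    5: ipaddress.ip_network("172.23.62.0/24"),   # core services
    7: ipaddress.ip_network("10.0.6.0/24"),      # 40 Gb storage
    10: ipaddress.ip_network("172.23.70.0/24"),  # work area
}

# Example assignments to verify (hypothetical hosts).
ASSIGNMENTS = [("seq-01", 4, "172.23.63.21"), ("gpfs-nsd-1", 7, "10.0.6.11")]

for host, vlan, addr in ASSIGNMENTS:
    ok = ipaddress.ip_address(addr) in VLANS[vlan]
    print(f"{host}: {addr} in VLAN{vlan} -> {'OK' if ok else 'WRONG SUBNET'}")
```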
Infrastructure Detail
| Domain | Components | Notes |
|---|---|---|
| Servers | 2 × management controllers, 5 × compute nodes, 2 × interface servers | RHEL 9 + Slurm, container stack (Singularity/Apptainer), Ansible automation. |
| Storage | 3 × 500 TB landing tiers, data-protection cluster | Primary & secondary synchronous; clinical tier isolated but policy-visible. |
| Network | 10 Gb ToR (dual), 40 Gb spine, enterprise work access switches, VPN/DX edge | ACL + QoS on the management gateway; NetBox source of truth. |
| UPS & Power | 2 × 30 kVA UPS, dual PDUs/rack, environmental sensors | N+1, SNMP exports to facility BMS, monthly battery test. |
| OS & Platform | RHEL 9, Slurm 23, GPFS, data-protection suite, AWS CLI / Snowball Edge | Integrated logging to Splunk, monitoring via Grafana/Prometheus. |
| Access & Security | Okta SSO, Crown RBAC, TACACS+, Syslog, Vault secrets, MFA VPN | Runbooks for least-privilege job submission + data sharing. |
Investment & Support Model (USD)
| Category | Included Items | Estimated Cost | Notes |
|---|---|---|---|
| One-Time Hardware | Racks, UPS, management controllers, compute nodes, 3 × 500 TB tiers, ToR switches, data-protection platform | $1.35M | Based on US list pricing with 18% Crown discount. |
| Implementation | Install, cabling, config, validation, documentation | $280K | Includes vendor pro services + Crown engineering. |
| Software & Licenses | RHEL, Slurm support, data-protection suite, GPFS, monitoring stack | $160K | Year 1 subscription + support. |
| Ongoing Support | Managed operations, 24/7 monitoring, AWS egress | $42K / mo | Covers staff, spares, cloud storage/transfer fees. |
Plan · Quote · Build · Run
Engagement spans four stages with iterative checkpoints to keep stakeholders aligned.
1 · Plan: Validate requirements, confirm data growth, lock facility intake, finalize risk log.
2 · Quote: Issue BoM + services SOW, align CrownBio + vendor timelines, secure approvals.
3 · Build: Rack/stack, configure networks, deploy cluster/storage, execute integration tests.
4 · Run: Cut over sequencers, monitor SLAs, optimize workloads, review quarterly with business.