Program Overview
Scope covers facility readiness, compute/storage build, policy-aligned access, and AWS data exchange so CBSD can run combined clinical + discovery sequencing with predictable turnaround.
Mission Summary
- Converge legacy CBSD delivery items with refreshed rack, power, and network layouts.
- Guarantee dual-room resiliency: Sequencer Room (instrument + UPS) and Datacenter Room (compute, storage, data protection).
- Stand up automated transfers to Crown AWS tenant for analytics and partner sharing.
Key Highlights
- 10 Gb production VLAN4, 40 Gb data VLAN7, VLAN2 IPMI, VLAN10 work area.
- Three 500 TB tiers (primary, secondary, clinical isolation) with the data-protection bridge.
- Crown-standard TACACS+, syslog, RBAC enforcement on the management fabric.
Global Architecture
The architecture view now mirrors the provided “NGS IT Infrastructure & Data Flow” reference: Sequencing and data generation on the left, HPC and primary storage at the center, and existing infrastructure/AWS services on the right, tied together by the same color-coded fabrics.
Architecture Narrative: dual 10 Gb ingest feeds land on the sequencer switch, step across management, data, and storage switches into the five-blade HPC compute cluster, and write to the NetApp FAS2820 HA pair before Cohesity/clinical tiers replicate outward. Existing firewall + Netgear infrastructure provides policy-based routing to analysts, AWS S3/Glacier, and VM file shares without touching the protected sequencing VLANs.
Business & Data Flow
MGISeq 2 and NovaSeq 6000 capture runs, write via CIFS to the sequencer switch, and post health/control data over 1 Gb management links.
10 Gb data and 25 Gb storage fabrics (Catalyst 9200L + Nexus 9300) carry payloads into the HPC rack while IPMI/OOB stays isolated on VLAN2.
Dell PowerEdge R760xs blades execute pipelines under Slurm, writing to the NetApp HA pair and flagging workloads for discovery vs. clinical storage.
Bioinformaticians access results via the legacy Netgear core, Cohesity snapshots replicate to clinical storage + AWS, and job submissions keep the loop closed.
Design Principles & Delivery Milestones
Design Tenets
- Segregate production, control, data, and work networks with explicit ACLs.
- Prefer dual-power, dual-path connectivity for every rack component.
- Automate validation: checksum, logging, and ticketing per run.
- Cloud exchanges gated by data-protection policy + encryption.
Success Metrics
- < 6 hours ingest-to-analysis SLA for 2 TB runs.
- RPO < 1 hr via data-protection snapshots + S3 replication.
- Instrument utilization ≥ 80% with no backlog.
- Zero critical audit findings for VLAN segmentation.
Facility build, storage mirror, cluster QA, and go-live checkpoints.
Current CBSD Baseline
Facility
- Dedicated cold aisle ready; 38,000 BTU/h HVAC confirmed.
- Two 42U racks reserved; PDUs awaiting final circuit test.
Network
- Legacy core switch ports exhausted; temporary ToR installed.
- VLAN2/4/7 definitions approved; ACLs in staging.
Storage & Backup
- 500 TB landing shelves delivered; awaiting rack-in.
- Data-protection cluster licensed; policies drafted.
Process
- Runbooks updated to include AWS data escrow.
- LIMS integration in progress; IAM roles mapped.
Execution Checklist by Zone
1 · Lab & Sequencer Room
- Install dual 10 Gb ToR with VLAN4 trunks + VLAN2 OOB.
- Provide UPS-backed receptacles (2x30A) for each sequencer.
- Patch fiber pairs to datacenter core, label per Crown standard.
- Calibrate humidity + particle sensors feeding facility BMS.
2 · Datacenter & Racks
- Rack two management controllers, five compute nodes, three storage shelves.
- Deploy dual PDUs per rack; map to UPS + utility.
- Terminate protection + AWS gateways in upper U for airflow.
- Connect TACACS+/syslog uplinks to security core.
3 · Physical Cabling
- Fiber trunk: Sequencer room → Datacenter (OM4, 12-core).
- Cat6A patch for VLAN10 work area + console runs.
- Label 40 Gb QSFP28 runs for storage mesh, include spare.
- Document cross-connect schedule in NetBox.
4 · Compute, Storage, Cloud
- Install RHEL 9 + Slurm on cluster; configure container runtime.
- Carve GPFS pools (500 TB primary/secondary/clinical) with QoS tiers.
- Configure the data-protection platform to replicate to AWS S3 + Glacier; test Snowball fallback.
- Publish user access via Okta + Crown RBAC; map VPN/DX routes.
5 · User & Remote Access
- Provision analyst pods with VLAN10 + secure jump hosts.
- Enable remote monitoring (Grafana, Splunk) with RBAC roles.
- Implement service catalog for job submission + data pulls.
- Train support teams on escalation + DR exercises.
Excel rack map: 42U positions from CBSD-NGS&HPC-Cost Estimation.xlsx are mirrored verbatim so facilities can cross-check onsite labeling.
| U | Rack 01 · Sequencer / Storage | Rack 02 · Compute / Fabric |
|---|---|---|
| 42 | — | — |
| 41 | — | — |
| 40 | — | — |
| 39 | Cisco C9200L (Mgmt) | Cisco 9300 (25 Gb) |
| 38 | — | — |
| 37 | — | — |
| 36 | Cisco C9200L (Data) | Cisco 9300 (10 Gb) |
| 35 | — | — |
| 34 | — | — |
| 33 | — | — |
| 32 | — | — |
| 31 | — | — |
| 30 | — | — |
| 29 | — | — |
| 28 | — | Dell R760xs |
| 27 | — | — |
| 26 | — | — |
| 25 | — | Dell R760xs |
| 24 | — | — |
| 23 | — | — |
| 22 | PDU | PDU |
| 21 | — | — |
| 20 | — | Dell R760xs |
| 19 | — | — |
| 18 | — | — |
| 17 | NetApp FAS2820 | Dell R760xs |
| 16 | — | — |
| 15 | — | — |
| 14 | — | Dell R760xs |
| 13 | — | — |
| 12 | — | — |
| 11 | — | — |
| 10 | — | — |
| 9 | — | — |
| 8 | — | — |
| 7 | NetApp FAS2820 | NetApp FAS2820 |
| 6 | — | — |
| 5 | — | — |
| 4 | — | — |
| 3 | — | — |
| 2 | — | — |
| 1 | PDU | PDU |
Network & VLAN Plan
| VLAN | Name / Purpose | IP Space | Bandwidth | Notes |
|---|---|---|---|---|
| 2 | IPMI / OOB | 172.23.64.0/24 | 1 Gb | Management BMCs, UPS, PDUs, storage controllers, TACACS+. |
| 4 | 10 Gb Production | 172.23.63.0/24 | 10 Gb | Sequencers, interface servers, compute entry, AWS gateway. |
| 5 | Core Services | 172.23.62.0/24 | 10 Gb | Management appliances, jump hosts, automation. |
| 7 | 40 Gb Storage | 10.0.6.0/24 | 40 Gb | GPFS fabric for three 500 TB tiers, data-protection ingest. |
| 10 | Work Area | 172.23.70.0/24 | 1 Gb | Analyst workstations, remote admin, LIMS terminals. |
Infrastructure Detail
| Domain | Components | Notes |
|---|---|---|
| Servers | 2 × management controllers, 5 × compute nodes, 2 × interface servers | RHEL 9 + Slurm, container stack (Singularity/Apptainer), Ansible automation. |
| Storage | 3 × 500 TB landing tiers, data-protection cluster | Primary & secondary synchronous; clinical tier isolated but policy-visible. |
| Network | 10 Gb ToR (dual), 40 Gb spine, enterprise work access switches, VPN/DX edge | ACL + QoS on the management gateway; NetBox source of truth. |
| UPS & Power | 2 × 30 kVA UPS, dual PDUs/rack, environmental sensors | N+1, SNMP exports to facility BMS, monthly battery test. |
| OS & Platform | RHEL 9, Slurm 23, GPFS, data-protection suite, AWS CLI / Snowball Edge | Integrated logging to Splunk, monitoring via Grafana/Prometheus. |
| Access & Security | Okta SSO, Crown RBAC, TACACS+, Syslog, Vault secrets, MFA VPN | Runbooks for least-privilege job submission + data sharing. |
Bill of Materials & Budget (USD)
Figures below are transcribed directly from CBSD-NGS&HPC-Cost Estimation.xlsx so the webpage matches the latest excel-based sourcing package.
| Item | Brand / Model | Qty | Unit Price | Subtotal | Notes |
|---|---|---|---|---|---|
| Server & Storage | |||||
| Analysis Servers | Dell PowerEdge R760xs | 5 | $17,000 | $85,000 | 2× Intel Gold 6526Y, 8×64 GB RDIMM, Broadcom 57414 10/25 Gb. |
| NGS Storage (HA) | NetApp FAS2820 Dual Controller | 2 | $150,000 | $300,000 | 48×22 TB, 8×25 Gb SFP28, SnapMirror for 500 TB usable. |
| NAS / Clinical Storage | NetApp FAS2820 Dual Controller | 1 | $150,000 | $150,000 | Isolated CIFS share for clinical workloads. |
| Precision Workstations | Dell Precision 3680 Tower | 2 | $4,300 | $8,600 | i7-14700, 2×32 GB, 2×2 TB SSD RAID1, Win 11 Pro. |
| OS & Platform Stack | RHEL 9, Slurm 23, GPFS, data-protection suite | 5 | $799 | $3,995 | Subscription incl. AWS CLI / Snowball Edge tooling. |
| Network & Optics | |||||
| Core Fabric Switch | Cisco N9K-C93108TC-FX3 | 1 | $25,000 | $25,000 | 48×1/10G-T, 6×40/100 G QSFP28. |
| 25 Gb Fabric Switch | Cisco N9K-C93180YC-FX3 | 1 | $30,000 | $30,000 | 48×1/10/25 Gb SFP28, 6×40/100 G QSFP28. |
| Mgmt Access Switch | Cisco C9200L-48T-4X-E | 1 | $5,500 | $5,500 | 48×1 Gb + 4×10 Gb uplinks, Network Essentials. |
| Data Access Switch | Cisco C9200L-48T-4X-E | 1 | $5,500 | $5,500 | Dedicated for sequencer/data VLANs. |
| QSFP Active Optical Cable | QSFP-100G-AOC5M | 2 | $1,500 | $3,000 | 5 m AOC for fabric interconnect. |
| 25 Gb SFP28 Optics | Cisco SFP-25G-SR-S | 18 | $450 | $8,100 | NetApp storage uplinks. |
| 10 Gb SFP+ Optics | Cisco MA-SFP-10GB-SR | 4 | $300 | $1,200 | Clinical storage + legacy tie-ins. |
| Fiber Patch Kit · 10 m | LC-LC, MM, OM4 | 15 | $200 | $3,000 | NetApp storage & clinical storage. |
| Fiber Patch Kit · 5 m | LC-LC, MM, OM4 | 15 | $150 | $2,250 | Short intra-rack jumpers. |
| Cat6A Copper Cables | 10 m factory bundle | 10 | $0 | $0 | Sequencer kit inclusion. |
| Cat6 Copper Cables | 10 m (Data & IPMI) | 10 | $0 | $0 | Bundled with switch purchase. |
| Cat6 Copper Cables | 5 m (Rack jumpers) | 10 | $0 | $0 | No-charge accessory. |
| Accessory Materials | Cable trays, labels, hardware | 1 | $1,000 | $1,000 | Structured cabling labor kit. |
| Facility & Power | |||||
| Dedicated Cold Aisle | HVAC / 38,000 BTU/h | 1 | $0 | $0 | Already provisioned; no uplift. |
| 42U Racks + Dual PDUs | Standard rack cabinet | 2 | $2,010 | $4,020 | Includes blanking + monitoring. |
| UPS · Datacenter | Schneider Easy UPS 3M 100 kVA | 1 | $51,000 | $51,000 | Primary HPC/Storage power. |
| UPS · Sequencer | Schneider Easy UPS E3S 30 kVA | 2 | $15,000 | $30,000 | Dedicated to MGISeq & NovaSeq. |
| Structured Cabling | Network link works | 1 | $0 | $0 | Included in facility scope. |
| Services & Support | |||||
| Remote Implementation Service | Dongke Service | 1 | $12,000 | $12,000 | Architecture, WAN/HPC design, validation. |
| On-site Service | Deployment block | 1 | $5,000 | $5,000 | Hands-on install & handoff. |
| Annual Maintenance | Dongke Support | 1 | $6,000 | $6,000 | Remote monitoring + updates. |
| Total Program Estimate | $740,165 | Matches Excel overview tab (Server/Storage + Network + Facility + Support). | |||
Investment & Support Model (USD)
| Category | Included Items | Estimated Cost | Notes |
|---|---|---|---|
| Server & Storage Build | 5× Dell R760xs, 3× NetApp FAS2820 stacks, Precision consoles, OS / platform subs | $547,595 | Direct pull from “Server&Storage” sheet. |
| Network & Optics | Nexus 9300 pair, dual Catalyst 9200L, optics kits, OM4 fibers, accessory materials | $84,550 | Covers switching plus all transceivers & patching. |
| Facilities & Power | Dedicated cold aisle, dual 42U racks + PDUs, Easy UPS 3M 100 kVA, Easy UPS E3S 30 kVA, structured cabling | $85,020 | Matches “Facility” sheet (zero-cost lines retained). |
| Services & Support | Dongke remote implementation, on-site block, annual maintenance | $23,000 | Breakout equals “Support Service” tab. |
| Total Program Estimate | Server/Storage + Network + Facility + Support | $740,165 | Aligns with Excel “Overview” summary. |
Plan · Quote · Build · Run
Engagement spans four stages with iterative checkpoints to keep stakeholders aligned.
Validate requirements, confirm data growth, lock facility intake, finalize risk log.
Issue BoM + services SOW, align CrownBio + vendor timelines, secure approvals.
Rack/stack, configure networks, deploy cluster/storage, execute integration tests.
Cutover sequencers, monitor SLAs, optimize workloads, review quarterly with business.