RobCo Forge - Deployment Guide

Overview

This guide provides instructions for deploying the complete RobCo Forge platform to production.

Architecture Overview

The RobCo Forge platform consists of:

  1. Infrastructure (Terraform + AWS CDK) - AWS resources, networking, Kubernetes
  2. API Services (Python/FastAPI) - Core API, authentication, RBAC
  3. Provisioning Service (Python) - WorkSpace lifecycle management
  4. Lucy AI Service (Python) - Anthropic Claude integration
  5. Cost Engine (Python) - Cost tracking and optimization
  6. CLI (TypeScript/Node.js) - Command-line interface
  7. Portal (Next.js/React) - Web interface

Prerequisites

Required Tools

AWS Account Setup

External Services

Phase 1: Infrastructure Deployment

1.1 Deploy Terraform Infrastructure

cd terraform/environments/production

# Initialize Terraform
terraform init

# Review plan
terraform plan -out=tfplan

# Apply infrastructure
terraform apply tfplan

This creates:

1.2 Deploy Kubernetes Resources

cd cdk

# Install dependencies
npm install

# Deploy CDK stacks
cdk deploy --all --require-approval never

This creates:

1.3 Configure Secrets

# Store secrets in AWS Secrets Manager
aws secretsmanager create-secret \
  --name forge/database \
  --secret-string '{"username":"forge","password":"<password>"}'

aws secretsmanager create-secret \
  --name forge/anthropic \
  --secret-string '{"api_key":"<anthropic-api-key>"}'

aws secretsmanager create-secret \
  --name forge/okta \
  --secret-string '{"client_id":"<okta-client-id>","client_secret":"<okta-client-secret>"}'
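
Hand-written JSON in --secret-string breaks as soon as a password contains a quote or a backslash. The following is a minimal Python sketch that lets json.dumps handle the escaping. The helper name build_secret_string is hypothetical and is not part of the Forge repository.

```python
import json


def build_secret_string(fields: dict) -> str:
    """Serialize secret fields to compact JSON for use as a --secret-string value.

    Hypothetical helper for illustration; rejects empty values so that a
    placeholder is not stored by accident.
    """
    for key, value in fields.items():
        if not value:
            raise ValueError(f"secret field {key!r} is empty")
    # Compact separators match the single-line style used in the commands above.
    return json.dumps(fields, separators=(",", ":"))
```

The returned string can be passed as the value of --secret-string for any of the three secrets above.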

Phase 2: Database Setup

2.1 Run Database Migrations

cd api

# Install dependencies
pip install -r requirements.txt

# Set database URL
export DATABASE_URL="postgresql://forge:<password>@<rds-endpoint>:5432/forge"

# Run migrations
alembic upgrade head
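
A password containing characters such as @, / or : has to be percent-encoded before it goes into DATABASE_URL, otherwise the URL is parsed incorrectly. A small sketch of that step, assuming a hypothetical helper build_database_url and the default user, port, and database name shown above:

```python
from urllib.parse import quote


def build_database_url(user: str, password: str, host: str,
                       port: int = 5432, database: str = "forge") -> str:
    """Assemble a PostgreSQL URL with the credentials percent-encoded.

    Hypothetical helper for illustration only.
    """
    # safe="" encodes every reserved character, including "/" and "@".
    encoded_user = quote(user, safe="")
    encoded_password = quote(password, safe="")
    return f"postgresql://{encoded_user}:{encoded_password}@{host}:{port}/{database}"
```

For example, the password p@ss/w:rd becomes p%40ss%2Fw%3Ard in the resulting URL.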

2.2 Create Initial Data

# Create admin user
python scripts/create_admin_user.py

# Create default blueprints
python scripts/create_default_blueprints.py

Phase 3: API Services Deployment

3.1 Build Docker Images

cd api

# Build API image
docker build -t forge-api:latest .

# Push to ECR
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin <account-id>.dkr.ecr.us-east-1.amazonaws.com
docker tag forge-api:latest <account-id>.dkr.ecr.us-east-1.amazonaws.com/forge-api:latest
docker push <account-id>.dkr.ecr.us-east-1.amazonaws.com/forge-api:latest
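
The registry hostname is repeated three times in the commands above, so a typo in the account ID or region is easy to miss. A minimal sketch that builds the image reference once, assuming a hypothetical helper ecr_image_uri:

```python
import re


def ecr_image_uri(account_id: str, region: str, repository: str,
                  tag: str = "latest") -> str:
    """Return the full ECR image reference used by docker tag and docker push.

    Hypothetical helper for illustration only.
    """
    # AWS account IDs are always 12 digits; catch a leftover placeholder early.
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError("account_id must be a 12-digit AWS account ID")
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"
```

Passing an immutable tag (for example a commit hash) instead of latest is one option if each pushed image should stay identifiable; that choice is a suggestion and not something this guide prescribes.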

3.2 Deploy to Kubernetes

# Apply Kubernetes manifests
kubectl apply -f k8s/api-deployment.yaml
kubectl apply -f k8s/api-service.yaml
kubectl apply -f k8s/api-ingress.yaml

# Verify deployment
kubectl get pods -n forge-api
kubectl logs -n forge-api -l app=forge-api

3.3 Configure Load Balancer

# Get load balancer URL
kubectl get ingress -n forge-api

# Configure DNS
# Point api.forge.example.com to load balancer

Phase 4: CLI Deployment

4.1 Build CLI

cd cli

# Install dependencies
npm install

# Build
npm run build

# Package
npm pack

4.2 Publish CLI

# Publish to npm (if public); scoped packages such as @robco/forge-cli need --access public
npm publish --access public

# Or distribute binary
npm run package

4.3 Install CLI

# Install globally
npm install -g @robco/forge-cli

# Configure
forge config set api-url https://api.forge.example.com
forge config set auth-method okta

Phase 5: Portal Deployment

5.1 Build Portal

cd portal

# Install dependencies
npm install

# Set environment variables
cat > .env.production << EOF
NEXT_PUBLIC_API_URL=https://api.forge.example.com
NEXT_PUBLIC_WS_URL=wss://api.forge.example.com/ws
EOF

# Build
npm run build
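
The two portal variables must point at the same host, with the WebSocket URL using wss when the API uses https. A short sketch that derives one from the other, assuming a hypothetical helper ws_url_from_api_url and the /ws path shown above:

```python
from urllib.parse import urlsplit, urlunsplit


def ws_url_from_api_url(api_url: str, path: str = "/ws") -> str:
    """Derive the WebSocket URL from the API base URL.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(api_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an http(s) URL: {api_url!r}")
    # https maps to wss, plain http maps to ws.
    scheme = "wss" if parts.scheme == "https" else "ws"
    return urlunsplit((scheme, parts.netloc, path, "", ""))
```

Deriving the value this way avoids a portal build in which the API and WebSocket hosts drift apart.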

5.2 Deploy Portal

Option A: Vercel

# Install Vercel CLI
npm install -g vercel

# Deploy
vercel --prod

Option B: Docker

# Build Docker image
docker build -t forge-portal:latest .

# Push to ECR
docker tag forge-portal:latest <account-id>.dkr.ecr.us-east-1.amazonaws.com/forge-portal:latest
docker push <account-id>.dkr.ecr.us-east-1.amazonaws.com/forge-portal:latest

# Deploy to Kubernetes
kubectl apply -f k8s/portal-deployment.yaml

Option C: Static Export

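# Build static export
# (on Next.js 14+, set output: 'export' in next.config.js; `npm run build` then writes out/ and the export step is not needed)
npm run build
npm run export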

# Deploy to S3 + CloudFront
aws s3 sync out/ s3://forge-portal-bucket/
aws cloudfront create-invalidation --distribution-id <dist-id> --paths "/*"

5.3 Configure DNS

# Point portal.forge.example.com to deployment
# Vercel: Use Vercel DNS or CNAME
# Kubernetes: Use load balancer URL
# S3: Use CloudFront distribution

Phase 6: Monitoring Setup

6.1 Configure Prometheus

# Verify Prometheus is running
kubectl get pods -n forge-system -l app=prometheus

# Access Prometheus UI
kubectl port-forward -n forge-system svc/prometheus 9090:9090

6.2 Configure Grafana

# Get Grafana admin password
kubectl get secret -n forge-system grafana-admin -o jsonpath='{.data.password}' | base64 -d

# Access Grafana UI
kubectl port-forward -n forge-system svc/grafana 3000:3000

# Import dashboards from grafana/dashboards/

6.3 Configure CloudWatch Alarms

# Create alarms for critical metrics
aws cloudwatch put-metric-alarm \
  --alarm-name forge-api-high-error-rate \
  --alarm-description "Alert when API error rate exceeds 5%" \
  --metric-name ErrorRate \
  --namespace Forge/API \
  --statistic Average \
  --period 300 \
  --threshold 5 \
  --comparison-operator GreaterThanThreshold \
  --evaluation-periods 2
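
With --period 300 and --evaluation-periods 2, the alarm above fires only after two consecutive 5-minute averages exceed the threshold, so roughly 10 minutes of sustained errors. The sketch below illustrates that rule in simplified form. It assumes ErrorRate is published as a percentage and ignores CloudWatch's missing-data handling; the function name alarm_state is hypothetical.

```python
from typing import Sequence


def alarm_state(datapoints: Sequence[float], threshold: float = 5.0,
                evaluation_periods: int = 2) -> str:
    """Simplified model of a GreaterThanThreshold alarm.

    datapoints holds one averaged value per period, oldest first.
    """
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    # Every one of the most recent periods must be strictly above the threshold.
    return "ALARM" if all(value > threshold for value in recent) else "OK"
```

A single spike, such as the averages 1.0, 9.0, 2.0, leaves the alarm in OK, and a value of exactly 5 does not breach because the comparison is strictly greater than.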

Phase 7: Validation

7.1 Smoke Tests

# Test API health
curl https://api.forge.example.com/health

# Test authentication
forge login

# Test workspace provisioning
forge launch --bundle STANDARD --os Windows

# Test portal (macOS; use xdg-open on Linux)
open https://portal.forge.example.com
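
Immediately after a rollout the health endpoint may fail for a few seconds while pods start, so a single curl can give a false negative. A minimal retry sketch using only the standard library follows. The function names and the retry defaults are assumptions, and it treats HTTP 200 from /health as healthy.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses carry a status code; report it instead of raising.
        return err.code


def wait_until_healthy(url: str, attempts: int = 5, delay: float = 2.0,
                       fetch: Callable[[str], int] = fetch_status,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll url until it returns 200 or the attempts are used up."""
    for attempt in range(1, attempts + 1):
        try:
            if fetch(url) == 200:
                return True
        except OSError:
            # Connection refused, DNS failure, and timeouts are all OSError subclasses.
            pass
        if attempt < attempts:
            sleep(delay)
    return False
```

Calling wait_until_healthy("https://api.forge.example.com/health") returns True once the API answers with 200, and False if it never does within the configured attempts.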

7.2 End-to-End Tests

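# Run each suite from the repository root; the subshells leave the working directory unchanged
(cd api && pytest tests/e2e/)

(cd portal && npm run test:e2e)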

7.3 Load Testing

# Run load tests
cd api
locust -f tests/load/locustfile.py --host https://api.forge.example.com

Phase 8: Post-Deployment

8.1 Monitor Logs

# API logs
kubectl logs -n forge-api -l app=forge-api --tail=100 -f

# Portal logs (if on Kubernetes)
kubectl logs -n forge-portal -l app=forge-portal --tail=100 -f

# CloudWatch logs
aws logs tail /aws/eks/forge/cluster --follow

8.2 Monitor Metrics

8.3 User Onboarding

# Create user accounts
python scripts/create_users.py --csv users.csv

# Assign roles
python scripts/assign_roles.py --user alice@example.com --role team_lead

# Set budgets
python scripts/set_budgets.py --team engineering --amount 5000

Rollback Procedures

API Rollback

# Rollback to previous version
kubectl rollout undo deployment/forge-api -n forge-api

# Verify rollback
kubectl rollout status deployment/forge-api -n forge-api

Portal Rollback

# Vercel
vercel rollback

# Kubernetes
kubectl rollout undo deployment/forge-portal -n forge-portal

Database Rollback

# Rollback migration
cd api
alembic downgrade -1

Troubleshooting

API Not Responding

# Check pod status
kubectl get pods -n forge-api

# Check logs
kubectl logs -n forge-api -l app=forge-api

# Check database connectivity
kubectl exec -it -n forge-api <pod-name> -- python -c "from src.database import engine; engine.connect().close(); print('database connection OK')"
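
When the connectivity check fails, it helps to separate network problems (security groups, DNS) from credential problems. The sketch below tests only TCP reachability of the host and port in DATABASE_URL, using the standard library. Both function names are hypothetical, and a successful result says nothing about whether the username and password are valid.

```python
import socket
from urllib.parse import urlsplit


def database_host_port(database_url: str, default_port: int = 5432):
    """Extract (host, port) from a PostgreSQL URL."""
    parts = urlsplit(database_url)
    if not parts.hostname:
        raise ValueError("DATABASE_URL has no host component")
    return parts.hostname, parts.port or default_port


def can_reach_database(database_url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the database host and port succeeds."""
    host, port = database_host_port(database_url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused connections, timeouts, and DNS failures all land here.
        return False
```

If this returns False from inside the pod, the problem is likely in networking and not in the credentials.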

Portal Not Loading

# Check build logs
npm run build

# Check environment variables
cat .env.production

# Check API connectivity
curl https://api.forge.example.com/health

WorkSpace Provisioning Failing

# List WorkSpaces and check their current state
aws workspaces describe-workspaces

# Check IAM permissions
aws iam get-role --role-name ForgeWorkSpacesRole

# Check logs (case-insensitive, so "WorkSpace" and "workspace" both match)
kubectl logs -n forge-api -l app=forge-api | grep -i "workspace"

Security Checklist

Maintenance

Daily

Weekly

Monthly

Quarterly

Support

Documentation

Contact

Appendix

Environment Variables

API Service

DATABASE_URL=postgresql://forge:<password>@<rds-endpoint>:5432/forge
REDIS_URL=redis://<redis-endpoint>:6379
ANTHROPIC_API_KEY=<api-key>
OKTA_CLIENT_ID=<client-id>
OKTA_CLIENT_SECRET=<client-secret>
OKTA_DOMAIN=<okta-domain>
AWS_REGION=us-east-1
LOG_LEVEL=INFO
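
A missing variable usually surfaces only as a crash at startup, so a pre-flight check can save a failed rollout. A minimal sketch, assuming every variable listed above except LOG_LEVEL is required (that split is an assumption, as is the helper name missing_env_vars):

```python
import os
from typing import Mapping, Optional, Sequence

# Assumed to be required; LOG_LEVEL is treated as optional here.
REQUIRED_API_VARS = (
    "DATABASE_URL",
    "REDIS_URL",
    "ANTHROPIC_API_KEY",
    "OKTA_CLIENT_ID",
    "OKTA_CLIENT_SECRET",
    "OKTA_DOMAIN",
    "AWS_REGION",
)


def missing_env_vars(required: Sequence[str] = REQUIRED_API_VARS,
                     environ: Optional[Mapping[str, str]] = None) -> list:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name, "").strip()]
```

An empty list means all required variables are present; otherwise the returned names indicate what still needs to be set.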

Portal

NEXT_PUBLIC_API_URL=https://api.forge.example.com
NEXT_PUBLIC_WS_URL=wss://api.forge.example.com/ws

CLI

FORGE_API_URL=https://api.forge.example.com
FORGE_AUTH_METHOD=okta

Resource Requirements

API Service

Portal

Database

FSx ONTAP