Case Study

Northstar Platform

A production-grade, open-source service-marketplace backend — built end-to-end with enterprise-style architecture, security, testing discipline, and deployment readiness. The code is public.

January 1, 2024 · Arman Hazrati

Backend, NestJS, TypeScript, PostgreSQL, Redis, Security, DevOps, Architecture

Explore the architecture

Modular monolith: one deploy, clear module boundaries, durable async work.

A production-grade backend system demonstrating how I design, secure, test, and ship real platforms.


Overview

Northstar is a service marketplace backend built to production standards. It is not a tutorial project or a proof of concept; it is a complete, deployable system with enterprise-grade architecture, security, and testing.

The codebase exists specifically for technical evaluation. Every architectural decision, security implementation, and test case is inspectable.

Problem Statement

Modern service marketplaces require infrastructure beyond basic CRUD operations:

Multi-role authorization — Different access levels for customers, providers, staff, and administrators
Complex workflows — Service requests move through multiple states with validation at each transition
Background processing — Notifications, cleanup tasks, and async operations
Audit requirements — Complete traceability for compliance
Horizontal scalability — Architecture that scales without rewrites

Northstar addresses these requirements with a clean, modular architecture.

Architecture

System Design

The interactive map above traces a request end to end: the NestJS API gateway authenticates and validates, a centralized RBAC guard authorizes, domain services run the business logic, and durable work is handed to a BullMQ queue. Use the Security path and Deployment layers to see how authorization and the runtime topology fit together.

Key Decisions

Layered Architecture — Controllers handle HTTP concerns only. Services contain business logic. Prisma handles data access. No leaky abstractions.

Event-Driven Patterns — Domain events decouple workflows. A service request status change triggers notifications without tight coupling.

Repository Pattern — Prisma provides type-safe queries. Data access is abstracted, testable, and replaceable.

DTO Validation — Every API input is validated through class-validator DTOs. Invalid requests never reach business logic.
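The validation boundary described above can be sketched without the framework. This is a hand-rolled stand-in for the checks that class-validator decorators would declare, not the project's actual DTOs: the `CreateServiceRequestDto` shape, its field names, and `validateCreateServiceRequest` are all hypothetical.

```typescript
// Hypothetical DTO shape; field names are illustrative, not taken from the Northstar source.
interface CreateServiceRequestDto {
  title: string;
  description: string;
  budget?: number;
}

type ValidationResult<T> =
  | { ok: true; value: T }
  | { ok: false; errors: string[] };

// Stand-in for decorator-based validation: collect every problem, then
// either return a typed value or a list of errors. Business logic only
// ever receives the `ok: true` branch.
function validateCreateServiceRequest(
  input: unknown
): ValidationResult<CreateServiceRequestDto> {
  if (typeof input !== "object" || input === null) {
    return { ok: false, errors: ["body must be an object"] };
  }
  const body = input as Record<string, unknown>;
  const title = body["title"];
  const description = body["description"];
  const budget = body["budget"];
  const errors: string[] = [];

  if (typeof title !== "string" || title.trim().length === 0) {
    errors.push("title must be a non-empty string");
  }
  if (typeof description !== "string" || description.trim().length === 0) {
    errors.push("description must be a non-empty string");
  }
  if (
    budget !== undefined &&
    (typeof budget !== "number" || !Number.isFinite(budget) || budget < 0)
  ) {
    errors.push("budget must be a non-negative number when present");
  }
  if (errors.length > 0) {
    return { ok: false, errors };
  }

  const value: CreateServiceRequestDto = {
    title: title as string,
    description: description as string,
  };
  if (typeof budget === "number") {
    value.budget = budget;
  }
  return { ok: true, value };
}
```

A controller would call the validator first and return a 400-style response on the error branch, so the service layer never has to re-check input shape.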

Security Implementation

Security is foundational, not bolted on.

Authentication & Authorization

Layer	Implementation
Identity	JWT with refresh token rotation
Authorization	RBAC with 4 distinct roles
Password Storage	bcrypt (cost factor 10)
Session Management	Stateless JWT access tokens, with a blacklist for revoked tokens
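Refresh token rotation, listed in the table above, is easiest to see in a small model. The sketch below is an assumption about how rotation could work, not the Northstar implementation: it uses an in-memory map and a counter-based token, where a real system would use a persistent store and cryptographically random values. Revoking every token for a user on reuse is a common hardening choice and is also assumed here.

```typescript
type TokenRecord = { userId: string; used: boolean };

// In-memory stand-in for the refresh-token store (hypothetical).
class RefreshTokenStore {
  private readonly tokens = new Map<string, TokenRecord>();
  private counter = 0;

  // Issues an opaque token. The counter keeps the sketch deterministic;
  // production tokens must be unguessable.
  issue(userId: string): string {
    this.counter += 1;
    const token = `rt_${userId}_${this.counter}`;
    this.tokens.set(token, { userId, used: false });
    return token;
  }

  // Rotation: each refresh token is single-use. Presenting a fresh token
  // retires it and returns a replacement. Presenting a retired token is
  // treated as possible theft and revokes all of that user's tokens.
  rotate(token: string): string {
    const record = this.tokens.get(token);
    if (!record) {
      throw new Error("unknown refresh token");
    }
    if (record.used) {
      this.revokeAll(record.userId);
      throw new Error("refresh token reuse detected");
    }
    record.used = true;
    return this.issue(record.userId);
  }

  private revokeAll(userId: string): void {
    this.tokens.forEach((record, token) => {
      if (record.userId === userId) {
        this.tokens.delete(token);
      }
    });
  }
}
```

The point of rotation is that a stolen refresh token is only useful until the legitimate client refreshes again, at which point the reuse is detectable.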

Role-Based Access Control

Four roles, each scoped to the smallest set of capabilities it needs:

Role	Capabilities
Admin	Full system configuration, user management, audit-log access, all resources
Business (Provider)	Respond to service requests, manage business profile, view own responses
Staff	Review incoming requests, manage request workflows, access operational reports
Customer	Create service requests, manage own requests, view request history
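The capability table above maps naturally onto a role-to-permission lookup with a deny-by-default check. The following is a minimal sketch: the permission names are invented to mirror the table, and `can` and `assertAllowed` are hypothetical helpers standing in for what a NestJS guard would do, so the real codebase may model this differently.

```typescript
type Role = "ADMIN" | "BUSINESS" | "STAFF" | "CUSTOMER";

// Hypothetical permission names derived from the capability table.
type Permission =
  | "request:create"
  | "request:manage-own"
  | "request:review"
  | "request:respond"
  | "profile:manage"
  | "report:read"
  | "audit:read"
  | "user:manage";

const ALL_PERMISSIONS: ReadonlyArray<Permission> = [
  "request:create",
  "request:manage-own",
  "request:review",
  "request:respond",
  "profile:manage",
  "report:read",
  "audit:read",
  "user:manage",
];

// One central table: each role gets only what it needs.
const ROLE_PERMISSIONS: Record<Role, ReadonlyArray<Permission>> = {
  ADMIN: ALL_PERMISSIONS,
  BUSINESS: ["request:respond", "profile:manage"],
  STAFF: ["request:review", "report:read"],
  CUSTOMER: ["request:create", "request:manage-own"],
};

// Deny by default: a permission missing from the table is never granted.
function can(role: Role, permission: Permission): boolean {
  return ROLE_PERMISSIONS[role].includes(permission);
}

// Guard-style wrapper: throws instead of returning false, so a handler
// cannot forget to act on the result.
function assertAllowed(role: Role, permission: Permission): void {
  if (!can(role, permission)) {
    throw new Error(`Forbidden: role ${role} lacks ${permission}`);
  }
}
```

Keeping the table in one place is what makes negative tests cheap: every "must be denied" case is a single `can(role, permission)` assertion.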

API Security

Rate limiting (100 req/min per client)
Helmet security headers
CORS configuration
Input sanitization
Audit logging for sensitive operations
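To make the rate limit above concrete, here is a minimal fixed-window limiter using the same 100 requests per minute figure. It is a sketch under assumptions: the source does not say which algorithm or library enforces the limit, the class name is hypothetical, and the state lives in process memory, which would not be shared across multiple instances.

```typescript
type WindowState = { windowStart: number; count: number };

class FixedWindowRateLimiter {
  private readonly windows = new Map<string, WindowState>();

  constructor(
    private readonly limit: number = 100,
    private readonly windowMs: number = 60_000
  ) {}

  // Returns true if the request is allowed. `now` is injectable so the
  // behavior can be tested without waiting for real time to pass.
  allow(clientId: string, now: number = Date.now()): boolean {
    const current = this.windows.get(clientId);

    // First request from this client, or the previous window has expired:
    // start a new window and count this request.
    if (!current || now - current.windowStart >= this.windowMs) {
      this.windows.set(clientId, { windowStart: now, count: 1 });
      return true;
    }

    // Window is still open and the budget is spent.
    if (current.count >= this.limit) {
      return false;
    }

    current.count += 1;
    return true;
  }
}
```

A fixed window is the simplest option and allows short bursts at window edges; a sliding window or token bucket smooths that out at the cost of more bookkeeping.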

Data Model

Core Entities

User — Identity with role assignment and status management

ServiceRequest — Primary business entity with workflow states:

DRAFT → SUBMITTED → IN_REVIEW → ACCEPTED → IN_PROGRESS → COMPLETED
CANCELLED (terminal state from any active state)

ProviderResponse — Quotes and proposals from business users

AuditLog — Immutable record of system events with metadata
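The ServiceRequest workflow above can be expressed as a transition table, which keeps the "validation at each transition" rule in one place. This is a sketch, not the project's code: `canTransition` and `transition` are hypothetical names, and it assumes that "active state" means every state except COMPLETED and CANCELLED.

```typescript
type RequestStatus =
  | "DRAFT"
  | "SUBMITTED"
  | "IN_REVIEW"
  | "ACCEPTED"
  | "IN_PROGRESS"
  | "COMPLETED"
  | "CANCELLED";

// Each state lists the states it may move to. Terminal states list none.
// Assumption: DRAFT counts as an active state and can be cancelled.
const TRANSITIONS: Record<RequestStatus, ReadonlyArray<RequestStatus>> = {
  DRAFT: ["SUBMITTED", "CANCELLED"],
  SUBMITTED: ["IN_REVIEW", "CANCELLED"],
  IN_REVIEW: ["ACCEPTED", "CANCELLED"],
  ACCEPTED: ["IN_PROGRESS", "CANCELLED"],
  IN_PROGRESS: ["COMPLETED", "CANCELLED"],
  COMPLETED: [],
  CANCELLED: [],
};

function canTransition(from: RequestStatus, to: RequestStatus): boolean {
  return TRANSITIONS[from].includes(to);
}

// Returns the new status, or throws if the move is not in the table.
function transition(from: RequestStatus, to: RequestStatus): RequestStatus {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid transition: ${from} -> ${to}`);
  }
  return to;
}
```

Because the table is data, adding a state or tightening a rule is a one-line change, and the same table can drive both the service logic and its tests.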

Database Design

12+ strategic indexes for query performance
Soft deletes for data recovery
JSON fields for flexible metadata
Proper foreign key constraints
Cascading delete rules

Background Processing

Job Queue Architecture

BullMQ handles async operations:

Email notifications — Queued to prevent request blocking
Audit log cleanup — Scheduled removal of old records
Retry logic — Exponential backoff for transient failures
Idempotency — Duplicate job prevention
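The retry and idempotency items above reduce to two small ideas: a delay that doubles per attempt, and a key that lets a duplicate enqueue be ignored. The sketch below models both in plain TypeScript rather than calling BullMQ, so `backoffDelayMs`, `IdempotentQueue`, the base delay, and the cap are all illustrative assumptions. In BullMQ itself, similar behavior is typically configured through job options such as `attempts`, `backoff`, and `jobId`.

```typescript
// Delay before retry number `attempt` (1-based): base, 2x base, 4x base, ...
// capped at `maxMs` so a long outage does not produce unbounded waits.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 1_000,
  maxMs: number = 60_000
): number {
  if (!Number.isInteger(attempt) || attempt < 1) {
    throw new RangeError("attempt must be a positive integer");
  }
  return Math.min(baseMs * 2 ** (attempt - 1), maxMs);
}

// In-memory model of duplicate prevention: a job is identified by a
// caller-supplied idempotency key, and a second enqueue with the same
// key is a no-op.
class IdempotentQueue<T> {
  private readonly seenKeys = new Set<string>();
  private readonly pending: Array<{ key: string; payload: T }> = [];

  // Returns true if the job was added, false if it was a duplicate.
  enqueue(key: string, payload: T): boolean {
    if (this.seenKeys.has(key)) {
      return false;
    }
    this.seenKeys.add(key);
    this.pending.push({ key, payload });
    return true;
  }

  get size(): number {
    return this.pending.length;
  }
}
```

A useful key is derived from the business event, for example a notification type plus the entity ID, so that a retried request produces the same key and is dropped instead of sending a second email.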

Observability

Structured logging with Pino
Correlation IDs across requests
Prometheus-compatible metrics endpoint
Health check endpoints for orchestration

Testing Strategy

Test Coverage

Unit Tests:        13+ test cases
E2E Tests:         15+ integration tests
Coverage:          Comprehensive across modules

What's Tested

Service layer business logic
Authorization guard behavior
API contract validation
Error handling paths
Database operations

Technology Stack

Component	Technology	Purpose
Framework	NestJS 10.3	Modular Node.js framework
Language	TypeScript 5.3	Type safety
Database	PostgreSQL 16	Primary data store
ORM	Prisma 5.7	Type-safe queries
Cache	Redis 7	Session, queue backing
Queue	BullMQ 5.0	Background jobs
Auth	Passport + JWT	Authentication
Validation	class-validator	DTO validation
Logging	Pino	Structured logs
Docs	Swagger/OpenAPI	API documentation

Deployment

Infrastructure Requirements

Minimum:

Node.js 20+
PostgreSQL 12+
Redis 6+

Production:

Docker/Kubernetes deployment
Load balancer
SSL termination
Database connection pooling

Configuration

Environment-based configuration with validation at startup. Missing required variables cause immediate failure with clear error messages.
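A minimal sketch of the fail-fast startup check described above follows. The variable names (`DATABASE_URL`, `REDIS_URL`, `JWT_SECRET`, `PORT`), the `AppConfig` shape, and the default port are assumptions for illustration; the source does not list the actual variables.

```typescript
interface AppConfig {
  databaseUrl: string;
  redisUrl: string;
  jwtSecret: string;
  port: number;
}

// Hypothetical list of required variables.
const REQUIRED_VARS = ["DATABASE_URL", "REDIS_URL", "JWT_SECRET"];

// Takes the environment as a parameter (rather than reading a global) so
// the check is easy to test. Reports every missing variable in one error
// instead of failing on the first.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(
      `Missing required environment variables: ${missing.join(", ")}`
    );
  }

  const rawPort = env["PORT"] ?? "3000";
  const port = Number(rawPort);
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`PORT must be a positive integer, got "${rawPort}"`);
  }

  return {
    databaseUrl: env["DATABASE_URL"] as string,
    redisUrl: env["REDIS_URL"] as string,
    jwtSecret: env["JWT_SECRET"] as string,
    port,
  };
}
```

Calling `loadConfig(process.env)` once during bootstrap means a misconfigured deployment crashes before it accepts traffic, with a message that names exactly what is missing.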

Project Metrics

100+ TypeScript files with strict typing
9 feature modules with clear boundaries
20+ API endpoints with full documentation
28+ test cases across unit and E2E
15+ documentation files

Tradeoffs

Every decision here bought something and cost something. The honest version:

BullMQ + Redis over a simpler in-process queue — bought durability and retry semantics for async work (notifications, payouts); cost an extra piece of infrastructure to run and monitor. Worth it the moment work must survive a restart.
RBAC enforced at the guard layer over per-handler checks — bought one consistent place to reason about authorization; cost some indirection when reading an individual endpoint. The consistency is what prevents security gaps.
Prisma over raw SQL — bought type-safe queries and fast iteration; cost fine-grained control on the hottest queries. The escape hatch (raw SQL) stays available for the few places that need it.
Modular monolith over microservices — bought simple local development and one deploy; cost the independent scaling you'd eventually want. At this stage, splitting early would have been complexity with no payoff.

The goal was never the most sophisticated architecture. It was the simplest one that stays correct under real authorization, real failure, and real concurrency.

Decision Log

The reasoning behind the choices that shaped the system — including the alternatives I rejected and what each choice cost.

Decision	Why it was chosen	Alternative considered	Tradeoff accepted
Modular monolith	One deploy, simple local dev, clear module boundaries	Microservices from day one	Gives up independent scaling until load justifies a split
PostgreSQL + Prisma	Type-safe queries, relational integrity, fast iteration	MongoDB; raw SQL	Less control on the hottest queries (raw SQL kept as an escape hatch)
BullMQ + Redis for async work	Durable, retryable jobs that survive a restart	In-process queue / cron	One more piece of infrastructure to run and monitor
RBAC at the guard layer	A single, consistent place to reason about authorization	Per-handler permission checks	Some indirection when reading an individual endpoint
Idempotent background jobs	Payouts and notifications must tolerate retries safely	Best-effort, at-most-once jobs	Extra bookkeeping (idempotency keys, dedupe) on every job

Failure Modes

What I'd expect to break first, and where the design absorbs it:

Redis / queue outage stalls async work. Payouts and notifications queue up or fail. Mitigation: jobs are durable and idempotent, so they resume on recovery rather than losing or duplicating work.
Payout double-processing. Retries or partial failures could pay twice. Mitigation: idempotency keys and a payout state machine that's safe to replay.
RBAC misconfiguration → privilege escalation. A misconfigured guard exposes data it should protect. Mitigation: authorization is centralized in one layer and asserted by tests, including negative ("must be denied") cases.
Shared database as the bottleneck. A modular monolith funnels load to one database. Mitigation: clear module boundaries make the highest-traffic modules separable later, and read replicas would come before any split.
Monolith deploy risk. One bad deploy affects everything. Mitigation: strong test coverage at boundaries and a fast rollback path; contract tests planned before any service split.

Lessons Learned

Authorization is a cross-cutting concern, not a feature. Centralizing it early made every later endpoint cheaper and safer to add.
Idempotency is easier designed in than retrofitted. Treating background jobs as "may run more than once" from the start removed a whole class of bugs.
Tests are design feedback. The parts that were hard to test were the parts that were poorly factored — the test suite kept the boundaries honest.

Future Improvements

Split the highest-traffic modules into independently deployable services once load justifies the operational cost.
Add distributed tracing across the API and queue workers for end-to-end visibility.
Introduce contract tests at module boundaries to make a future service split safe.

What This Demonstrates

This project demonstrates capability in:

System Design — Clean architecture that scales
Security Engineering — Defense in depth, not afterthought
Testing Discipline — Comprehensive coverage, not checkbox
Production Thinking — Deployment-ready, not demo-only
Documentation — Clear, maintainable, professional

Role: Backend Architect & Developer
Status: Production Ready
License: MIT
