Staff Product Manager (Platform Reliability)
Kraken
Leading product strategy and delivery for Kraken's Platform Reliability Group, supporting ~1,500 engineers across the business to build and operate reliable, scalable systems that power energy utilities serving 70+ million customer accounts globally. This role brings together product management, platform engineering, and commercial expertise to enable teams to 'build safe and fast', maintaining Kraken's culture of rapid innovation while embedding reliability practices that help utilities make a big green dent in the climate crisis:
- Own roadmap for three teams delivering observability, incident management, and reliability initiatives across 200+ daily deployments to 25+ environments globally.
- Lead incident management, including all processes, tooling, and post-incident frameworks. Run the fortnightly Platform Health Review for engineering leadership.
- Drive Datadog observability strategy, launching company-wide training and enabling teams through self-service tools, consulting, and automated guardrails.
- Report weekly to SMT on reliability metrics and work with engineering leaders to establish ownership frameworks and standards.
- Co-own quarterly planning and reliability standards with the Technical Lead, and jointly drive the 'build safe and fast' culture across Kraken.