About Pedro J.
- Architecture reviews and design validation
- Production issue investigation and postmortem analysis
- Cloud infrastructure assessment and improvement
- Scaling strategies and failure mode audits
Languages
- Spanish: Native or bilingual
- Catalan: Native or bilingual
- English: Native or bilingual
Experience
- Microsoft | Senior Software Engineer | December 2021 - Today (4 years and 6 months) | Barcelona, Spain
- Lead reliability engineering for Azure Managed Disks, ensuring the resilience of a core storage platform powering millions of VMs across global regions.
- Conduct deep-dive troubleshooting of distributed systems, uncovering root causes of complex failures across compute, storage, and control plane components.
- Collaborate with engineering teams and PMs to prioritize architectural improvements, influence feature design, and steer product direction toward fault-tolerant patterns.
- Design and build internal tooling for anomaly detection, incident triage, and automated risk analysis, accelerating time-to-diagnosis and reducing MTTR.
- Develop and extend monitoring systems to proactively surface disk-related regressions, capacity hotspots, and service-level risks.
- Lead post-incident reviews and cross-team investigations, ensuring reliability insights feed directly into platform improvements.
- Champion a culture of resilience by embedding reliability considerations into design reviews, operational readiness checks, and roadmap planning.
- Amazon Web Services (AWS) | Senior Technical Account Manager | September 2020 - December 2021 (1 year and 3 months) | Barcelona, Spain
- Acted as the primary technical advisor to enterprise customers, guiding their successful use of AWS services across architecture, operations, and security.
- Built deep relationships with customer engineering teams, providing strategic guidance, performance reviews, and tailored best practices.
- Delivered operational reviews and readiness assessments for mission-critical workloads, ensuring high availability and resilience.
- Proactively identified risks, scaling bottlenecks, and architectural flaws in customer deployments, and led remediation strategies in collaboration with AWS specialist teams.
- Advocated on behalf of customers internally to influence service roadmaps and feature prioritization.
- Supported incident response, root cause analysis, and recovery planning for critical events.
- Helped customers adopt well-architected principles, optimize cloud costs, and improve performance across distributed systems.
- Collaborated cross-functionally with Solutions Architects, Support, and Engineering to deliver consistent, high-value customer outcomes.
- Elastic | Site Reliability Engineer II | May 2019 - September 2020 (1 year and 4 months) | Reading, UK
- Designed and deployed highly available network connectivity between Elastic Cloud and Microsoft Azure, enabling robust and fault-tolerant cross-region communication for internal services.
- Led the migration of Elastic's image creation pipeline to leverage Azure Shared Image Galleries, replacing a legacy snapshot-based approach with a scalable and maintainable system.
- Improved deployment reliability and boot time consistency across Elastic Cloud by streamlining image distribution workflows.
- Contributed to infrastructure automation and monitoring efforts using tools like Terraform.
- Participated in the on-call rotation, handled incident response, and drove root cause investigations across cloud environments.
- Worked closely with engineering and product teams to embed reliability into platform features and streamline operational workflows.
Education
- IES Sales | Universitat Politècnica de Catalunya | 2005 | IES Sales