AI Product Engineer - ClickStack
WFA Digital Insight
As demand for AI-powered observability solutions grows, ClickHouse stands out with its innovative approach. With the market for real-time analytics expected to continue its rapid expansion, professionals with expertise in building agents and designing skills are in high demand. ClickHouse, recognized for its cutting-edge technology, offers a unique opportunity for engineers to make a significant impact. Before applying, candidates should be prepared to showcase their experience with production-level AI systems and a deep understanding of developer experience. The current boom in cloud services, with over 250% year-over-year growth in some sectors, underscores the importance of skilled professionals in this field.
Job Description
About the Role
ClickHouse, a leader in real-time analytics and observability, is seeking an experienced AI Product Engineer to join its team in developing the AI layer for its observability platform, ClickStack. This platform is designed to unify logs, metrics, traces, and session replays, enabling engineers to quickly identify root causes of incidents. The successful candidate will focus on building agentic capabilities on top of this petabyte-scale platform, with a strong emphasis on developer experience. The role involves collaborating with a team of innovators who are passionate about transforming how companies use data. ClickHouse has already made significant strides, with more than 3,000 customers and a growth rate of over 250% year-over-year, validating its position as a pioneer in the industry.
What You Will Do
- Build agents that can investigate incidents, surface anomalies, and provide concise summaries to on-call teams.
- Design and implement skills that capture the team's debugging processes, ensuring agents can pick up the right playbook instead of starting from scratch.
- Own the agent stack end-to-end, including context engineering, tool design, evaluations, tracing, and cost management.
- Develop the MCP servers, SDKs, and integrations necessary for customers' agents to read telemetry, take action, and remain observable.
- Collaborate with open-source contributors and customers to debug problems, learn from the experiences, and feed these insights back into the product.
- Tackle complex challenges such as latency, cost, context window limits, evaluation coverage, and hallucinations on real telemetry.
- Work closely with the development team to ensure the AI layer integrates seamlessly with the existing observability platform.
- Participate in the planning and execution of product roadmap initiatives related to AI and observability.
- Stay updated with the latest developments in AI, machine learning, and observability, applying this knowledge to improve ClickStack.
What We Are Looking For
- 5+ years of software engineering experience, with at least 1-2 years focused on LLM-powered systems or agents in production.
- Strong backend skills in TypeScript/Node.js or Python, and the ability to work comfortably in both.
- Hands-on experience building agents, including multi-step tool use, planning, memory, and error recovery, with a track record of shipping these agents and dealing with their failure modes.
- Experience designing skills, such as Markdown-based workflow encodings, and a clear understanding of their application.
- A strong grasp of production concerns, including p99 latency, cost per task, and whether a system can keep running without manual intervention.
- The ability to move quickly, ship often, and learn from failures.
- A passion for developer tools and a clear sense of what good developer experience looks like.
- Comfort with ambiguity and ownership, capable of working independently and as part of a team.
Nice to Have
- Experience with observability platforms and tools.
- Knowledge of cloud services, particularly those related to AI and machine learning.
- Participation in open-source projects, especially those related to AI, observability, or developer tools.
- Familiarity with Agile development methodologies and version control systems like Git.
Benefits and Perks
- The opportunity to work on a cutting-edge observability platform with a leader in the industry.
- Collaborative and dynamic work environment with a team of experienced professionals.
- Flexible remote work arrangements, allowing you to work from anywhere in the United Kingdom.
- Competitive compensation package, reflective of your skills and experience.
- Access to the latest tools and technologies, ensuring you stay at the forefront of your field.
- Ongoing training and development opportunities, supporting your career growth and aspirations.
- A culture that values innovation, creativity, and teamwork, recognizing and rewarding outstanding contributions.
How to Stand Out
- Showcase your experience with production-level AI systems, highlighting any agent-building or skill-design work.
- Prepare to discuss your understanding of developer experience and how you've improved it in previous roles.
- Emphasize your ability to work independently and as part of a team, highlighting collaborations and leadership experiences.
- Be ready to walk through your process for tackling complex challenges like latency and cost in AI systems.
- Highlight any open-source contributions, especially those related to AI, observability, or developer tools, as this demonstrates your commitment to the field and willingness to collaborate.
- When discussing your experience with LLM-powered systems, focus on specific achievements and lessons learned from shipping and maintaining these systems in production environments.
This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.