UP.Labs is a dynamic venture studio dedicated to building innovative startup companies from the ground up. We're seeking a skilled AI Agent Engineer to join our growing team and contribute to our mission of launching the next wave of successful startups.

Requirements

  • Design, build, and deploy agentic workflows (multi-step LLM chains with tool calling, retrieval, and structured output) for real-time, business-critical use cases.
  • Engineer for determinism and consistency by implementing constrained decoding, structured outputs, caching layers, and evaluation harnesses.
  • Build and maintain evaluation and regression frameworks: automated pipelines that measure accuracy, latency, and behavioral consistency across prompt and model changes.
  • Integrate LLM agents with external tools and APIs (databases, rules engines, business systems) using frameworks and tooling such as LangChain, LangGraph, CrewAI, Langfuse, or custom orchestration.
  • Deploy agentic systems on cloud infrastructure (AWS, Azure, and/or GCP), optimizing for low-latency inference and cost efficiency.
  • Implement guardrails, fallback logic, and observability to ensure agents fail gracefully and every decision is traceable.
  • Collaborate with data scientists, software engineers, and business stakeholders to translate business rules into agent behavior and tool definitions.
  • Stay current with the latest advancements in AI agents, large language models, and cloud technologies.
  • Practical, hands-on experience building and deploying agentic AI systems in production environments.
  • Proficiency in Python and experience building production backend systems.
  • Experience with LLM APIs (OpenAI, Anthropic, etc.) and agentic frameworks and tooling (LangChain, LangGraph, CrewAI, AutoGen, Langfuse, or equivalent).
  • Strong understanding of prompt engineering for reliability: structured outputs, few-shot patterns, chain-of-thought, and techniques that minimize hallucination.
  • Experience building evaluation and testing pipelines for AI systems, including behavioral evals and golden-set testing.
  • Expertise in at least one major cloud provider (AWS, Azure, or GCP) and in containerized deployment (Docker, Kubernetes).
  • Familiarity with vector databases (Pinecone, Weaviate, pgvector) and retrieval-augmented generation (RAG) patterns.
  • Solid knowledge of version control systems (e.g., Git) and CI/CD pipelines.
  • Strong problem-solving skills and ability to work collaboratively across teams.

Benefits

  • Work-Life Balance