Web3 Talentos - Latam

Senior AI Inference Engineer

🌐 Remote CLT
Remote: anywhere in Latam
Posted 1 month ago
About the job

About the role:

You will own the inference backbone behind QVAC's local AI stack: the C++ systems layer that makes models run fast, reliably, and predictably on real user hardware. The role centers on engineering quality at the runtime level, including startup behavior, memory pressure, throughput/latency balance, and long-session stability. You will define and evolve the core abstractions that inference features depend on, so new capabilities can be added without sacrificing performance or maintainability. This is a role for someone who enjoys low-level problem solving, clear technical ownership, and building infrastructure that other teams trust in production. Your work directly enables private, on-device AI experiences and helps set the technical foundation for QVAC's next generation of peer-to-peer AI products.

Responsibilities

  • Deploy machine learning models to edge devices using the llama.cpp, ggml, and ONNX frameworks

  • Collaborate closely with researchers on coding, training, and transitioning models from research to production environments

  • Integrate AI features into existing products, enriching them with the latest advancements in machine learning

Job requirements

  • Excellent programming skills in C++; experience in JavaScript is a bonus

  • Strong experience with the llama.cpp and ggml inference engines, which facilitate the deployment of models to specific GPU architectures

  • Good understanding of deep learning concepts and model architectures

  • Experience with transformers, LLMs, and diffusion models

  • Demonstrated ability to rapidly assimilate new technologies and techniques

  • A degree in Computer Science, AI, Machine Learning, or a related field, complemented by a solid track record in AI R&D

Bonus points if:

  • You have experience with JavaScript/TypeScript

  • You understand the difficulties, nuances, and importance of P2P technology

  • You have experience with any of Vulkan, Metal, or OpenCL

  • You have productionized models



Apply for this job

This job accepts applications directly on the company website.