仕事内容
<div class="content-intro"><h3><strong><span style="font-family: arial, helvetica, sans-serif;">ABOUT xAI</span></strong></h3>
<p><span style="font-family: arial, helvetica, sans-serif;">xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. </span><span style="font-family: arial, helvetica, sans-serif;">Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. </span><span style="font-family: arial, helvetica, sans-serif;">We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company’s mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. </span><span style="font-family: arial, helvetica, sans-serif;">All employees are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.</span></p></div><h3>About the Role:</h3>
<ul>
<li>We are building the high-performance inference platform that serves Grok to millions of users every day with lightning speed and perfect reliability.</li>
<li>As a Member of Technical Staff - Inference, you will design and optimize large-scale model serving systems end-to-end. You will own everything from distributed infrastructure (global KV cache, continuous batching, load balancing, auto-scaling) to deep low-level optimizations (GPU kernels, quantization, speculative decoding, tail latency).</li>
<li>This is a high-impact role where your work directly determines how fast and reliably users interact with Grok at massive scale</li>
</ul>
<p><span style="font-size: 14pt;"><strong>Responsibilities: </strong></span></p>
<ul>
<li>Architect and implement scalable distributed infrastructure for model serving (load ba