Senior Machine Learning Engineer on Red Hat's AI Inference Team
Brian D. is a senior machine learning (ML) engineer on our AI Inference team, which is part of the broader AI Engineering team at Red Hat. Based remotely in Chicago, Brian helps maintain LLM Compressor, a key component of vLLM (an open source inference server originally developed at UC Berkeley and now supported by a global community). vLLM is designed to make AI inference—in other words, responses from models—more efficient. Through LLM Compressor, Brian and his team make it possible to optimize and deploy LLMs so they run faster, consume less energy, and operate on fewer GPUs, without compromising performance.

The resulting impact? Lower computational barriers to working with AI—which, in turn, opens the door for more organizations, researchers, and innovators to use these models in meaningful ways.

We sat down with Brian to learn more about his journey, his team, and life as an ML engineer.

Tell us about your journey to Red Hat and AI

I joined Red Hat in January through the acquisition of Neural Magic. I was actually in the middle of the interview process with Neural Magic when the acquisition was announced. It was great timing: my first week was the same week the entire Neural Magic team met in Boston for Red Hat’s new hire orientation. I’ve been in the AI/ML field for several years, but I’m still new to this role at Red Hat. My career path has been a gradual shift toward AI.
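For readers curious what "compressing" a model actually looks like in practice, here is a minimal sketch of a one-shot weight quantization run in the spirit of the llm-compressor project's published examples. The model name, output directory, and exact import paths and parameters are illustrative assumptions and may differ between releases, so treat this as a rough outline rather than the team's exact workflow.

```python
# Minimal sketch: one-shot 4-bit weight quantization with llm-compressor.
# Import paths, schemes, and arguments follow the project's public examples
# and may vary between releases; model and output names are placeholders.
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.transformers import oneshot

# Recipe: quantize the weights of every Linear layer to 4 bits (W4A16),
# leaving the output head in full precision.
recipe = GPTQModifier(scheme="W4A16", targets="Linear", ignore=["lm_head"])

# Calibrate and apply the recipe in a single pass, then save a compressed
# checkpoint that vLLM can load directly.
oneshot(
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",   # placeholder model
    dataset="open_platypus",                      # calibration dataset
    recipe=recipe,
    output_dir="TinyLlama-1.1B-Chat-v1.0-W4A16",  # placeholder output path
    max_seq_length=2048,
    num_calibration_samples=512,
)
```

The saved directory can then be served with vLLM (for example, `vllm serve <output_dir>` in recent releases), which is where the reduced memory footprint and smaller GPU count show up.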
Original post: https://www.redhat.com/en/blog/senior-machine-learning-engineer-red-hats-ai-inference-team