Job Summary:
We are looking for a Gen AI Engineer to drive the development and optimization of LLM (Large Language Model) and embedding model integrations on our AI platform. This role involves implementing prompt engineering frameworks, RAG (Retrieval-Augmented Generation) fine-tuning, and performance benchmarking to ensure high-quality AI-driven solutions.
The ideal candidate should have strong expertise in Python, MLOps, PyTorch, and AI model fine-tuning, along with a deep understanding of retrieval algorithms, agentic RAG architectures, and evaluation methodologies.
Key Responsibilities:
- Develop and optimize LLM and embedding model integrations for various applications.
- Implement prompt engineering frameworks and reusable templates for diverse AI-driven use cases.
- Design evaluation metrics and testing frameworks to benchmark Agentic RAG performance.
- Enhance retrieval algorithms and relevancy scoring mechanisms for AI-based search.
- Develop features for custom AI model fine-tuning and continuous adaptation.
- Work on function calling and tool integrations to enhance AI decision-making capabilities.
- Ensure API-based model deployment and optimization for enterprise-grade AI solutions.
Required Skills and Qualifications:
- 4-5 years of hands-on experience in AI/ML engineering with a focus on LLMs, RAG, and AI model tuning.
- Strong expertise in Python, MLOps, and PyTorch for AI/ML model development.
- Proficiency in prompt engineering, RAG fine-tuning, and AI-driven retrieval methods.
- Experience with AI model evaluations, performance benchmarking, and accuracy testing.
- Deep understanding of REST APIs for AI model deployment and real-time interactions.
- Familiarity with tool and function calling mechanisms in AI-driven applications.
- Ability to work in an Agile, fast-paced environment, collaborating with cross-functional teams.
Preferred Skills (Nice to have):
- Experience working with enterprise AI applications and custom model adaptations.
- Hands-on experience in vector search optimization for AI-based retrieval.
- Knowledge of LLM deployment strategies on cloud platforms (AWS, Azure, or GCP).
- Exposure to multi-modal AI models and hybrid RAG architectures.
Location:
ThoughtsWin Systems, Jaipur, India