Responsibilities
- Model Development & Training: Design, develop, and train vision-language models using state-of-the-art algorithms and frameworks (e.g., PyTorch, TensorRT).
- Feature Integration: Collaborate with Backend Engineers to integrate AI models into production, optimizing for speed, memory usage, and user experience.
- Data Management: Gather, preprocess, and label image/text datasets to ensure high-quality data pipelines.
- Performance Optimization: Continuously monitor and fine-tune models for scalability, accuracy, and latency on mobile platforms.
- Research & Innovation: Stay current with the latest research and publications in vision-language modeling, applying relevant advancements to our product.
- Cross-Functional Collaboration: Work closely with Product Managers, Mobile Developers, Backend Developers, and UX Designers to ensure AI-driven features meet user needs and deliver seamless experiences.
Requirements
- Bachelor’s degree in Computer Science, Machine Learning, or a related field (Ph.D. is a plus).
- 2+ years of experience working on computer vision, natural language processing, or multimodal AI projects.
- Proficiency in deep learning frameworks (PyTorch, TensorFlow) and libraries (e.g., Hugging Face, OpenCV).
- Experience with large vision-language models.
- Understanding of model deployment strategies, distributed training, and optimization techniques (quantization, pruning, etc.).
- Strong problem-solving and analytical skills, with the ability to translate abstract AI concepts into user-facing features.
Shortlisted candidates will be offered employment on a 6-month agency contract.