Senior ML Ops Engineer
Debtsy
At January, we’re transforming the lives of borrowers by bringing humanity to consumer finance. Our data-driven products empower financial institutions to streamline their collections, providing borrowers with straightforward and compassionate solutions to regain financial stability and control over their lives. We’re not just expanding access to credit. We’re restoring dignity and paving the way for millions to achieve financial freedom.
About the role:
As a Senior ML Ops Engineer, you’ll be instrumental in deploying January’s predictive machine learning models at scale. As a member of the DevX team, you’ll work with data engineers and software engineers to define our MLOps architecture and set up automated systems. The systems and processes you design will validate input data, retrain models, and manage their throughput and latency.
What You’ll Work On
Deploy and maintain pipelines: Ensure the efficient and error-free deployment of machine learning models, ultimately enabling product teams to consume batch predictions on a daily basis.
Implement Continuous Integration/Continuous Deployment (CI/CD) for ML models: Create and manage the automated pipelines for machine learning workflows, including training, testing, and deployment. Accelerate the ML lifecycle, ensuring the continuous availability of updated and optimized models and reducing manual errors.
Production monitoring of ML models: Ensure that models in production are closely monitored for performance decay and logged for audit purposes, to maintain the quality of the service provided by the model.
Workflow optimization: Work with distributed systems to ensure that ML models can handle large datasets and high loads in production. Provide robust, scalable machine learning solutions that can cater to growing business needs, ensuring high availability and performance.
Model versioning and rollback: Manage different versions of ML models and their associated data, enabling efficient experimentation, debugging, and rollback to a previous version if necessary. Maintain model reliability, reduce risk, and provide a rapid response to any issues with the current model version.
Troubleshooting: Diagnose and resolve issues so that ML models perform well in a production environment, minimizing the risk of deploying faulty models and maintaining the integrity and reliability of AI-enabled services.
What You Bring to the Table
Cloud MLOps: Experience productionizing AI/ML models, particularly in an AWS environment.
CI/CD and Automation: Proficiency in CI/CD tools such as GitHub Actions, Jenkins, CircleCI, or similar.
Performance Optimization: Skills in optimizing build and test pipelines to improve efficiency. Ability to diagnose and resolve performance bottlenecks in applications and infrastructure.
Monitoring and Observability: Experience with monitoring tools (e.g., Datadog, Grafana, Prometheus). Understanding of logging and observability tools (e.g., ELK Stack, Splunk).
Strategic thinking and ownership: The ability to identify business needs and trends, playing an active role in discovering, advocating for, and implementing ideas.
Problem-Solving and Troubleshooting: Proven ability to analyze existing systems and processes to find areas for enhancement. Ability to make rapid, thoughtful recommendations despite uncertainty.
Collaboration and mentorship: A strong commitment to mentoring and fostering a collaborative team environment.
Documentation and Communication: Ability to create clear, comprehensive documentation for processes and tools. Ability to speak and write clearly and articulately, tailoring the message to the target audience.
You might be a fit if you value:
Genuine collaboration. We recognize and explicitly reward feedback, alignment, and learning from each other.
Managers as advocates. You’ll report to leaders invested in your growth and promotion, not just their own. Managers only succeed when their teams thrive.
Outcomes over output. We value finding ways to do more with less, working smart instead of burning out.
Product ownership. Engineers are owners, not mercenaries. Everyone plays a role in discovering, advocating for, and implementing ideas.
Technical quality. We work collaboratively to ensure great outcomes without sacrificing quality. Sustainable development is a first-class priority.
We are an equal opportunity employer committed to diversity and inclusion in the workplace. We do not discriminate based on race, color, religion, sex, sexual orientation, gender identity, national origin, disability status, age, veteran status, or any other legally protected characteristic.
As a New York City-based company, we are dedicated to transparent, fair, and equitable compensation practices that reflect our commitment to fostering an environment where all team members are valued and supported. We encourage individuals from all backgrounds to apply.