Python/Machine Learning Engineer (Regular/Senior)
We foster a dynamic culture rooted in strong engineering, a sense of ownership, and transparency, empowering our team. As part of the expanding VirtusLab Group, we offer a compelling environment for those seeking to make a substantial impact in the software industry within a forward-thinking organization.
About the role
You will be responsible for building and owning PySpark data pipelines on a Spark Kubernetes cluster orchestrated with Airflow. You will improve and introduce data validation and monitoring to ensure trustworthy data at every stage. Tasks will include provisioning and managing Azure resources using a mature Infrastructure as Code approach, as well as automating everything with GitHub Actions and maintaining CI/CD workflows. You will enhance monitoring to further improve the reliability and stability of deployed ML solutions using the Grafana/Prometheus stack. Additionally, you will collaborate with cross-functional teams to ensure the seamless deployment and serving of ML models and actively shape the project’s technical roadmap and direction.
Loss prevention in retail involves the strategic implementation of processes and technologies designed to identify, mitigate, and prevent the disappearance of inventory. To achieve that, an Engineering team and a Data Science team within a major UK retailer partner to bridge the gap between experimental ML models and robust, production-grade systems. By embedding engineering excellence into the data science lifecycle, the team ensures that loss prevention insights are delivered with high reliability.
In this project you will not only develop high-quality Python code, but also implement trustworthy data pipelines on a big Spark cluster orchestrated with Airflow, set up highly automated CI/CD pipelines with GitHub Actions, and provision Azure infrastructure as code with Terraform.
Python, PySpark, Airflow, Azure, IaC (Terraform), CI/CD (GitHub Actions), Observability (Grafana/Prometheus), MLOps, Kubernetes
- Establish a resilient MLOps ecosystem by integrating robust observability, experiment tracking, and automated deployment to model serving infrastructure.
- Improve the reliability and observability of data pipelines to guarantee trustworthy data.
- Advance DevOps maturity through the implementation of standardized pipelines, enabling rapid iteration and minimizing manual intervention.
3 Engineers
As an ML Engineer in Forecasting and Commodities, you will be involved in projects that support critical decision-making processes by applying your Python, PySpark, Kubernetes and Cloud (Azure) skills. You will be working in a technically mature ecosystem, implementing new features and covering new use cases. Part of your responsibilities will be the design and implementation of a data science innovation framework, as well as contributing to the organization’s overall engineering best practices.
– Developing libraries, tools, and frameworks that standardise and accelerate development and deployment of machine learning models.
– Working in an Azure cloud environment, developing model training code in AzureML. Building and maintaining cloud infrastructure with IaC (infrastructure as code).
– Working with distributed data processing tools such as Spark, to parallelise computation for Machine Learning.
– Diagnosing and resolving technical issues, ensuring availability of high-quality solutions that can be adapted and reused.
– Collaborating closely with different engineering and data science teams, providing advice and technical guidance to streamline daily work.
– Championing best practices in code quality, security, and scalability by leading by example.
– Making your own informed decisions that move the business forward.
Python, PySpark, Airflow, Docker, Kubernetes, Azure (incl. Azure ML), pandas, scikit-learn, numpy, GitHub Actions, Azure DevOps, Terraform, Git @ GitHub
– Building a system that provides accurate and up-to-date business forecasts, by providing a set of tools that can be easily leveraged by data scientists and analysts.
– Streamlining the process of onboarding, deploying, and patching new ML pipelines.
– Collaborating with cross-functional teams to enhance customer experiences through innovative technologies.
– Employing DevOps practices for reproducible patterns in multiple business domains.
1 engineer from VirtusLab, 2 from the client side
What we expect in general:
- Strong experience in writing high-quality Python code and deploying production-level projects.
- Proactiveness and a strong sense of ownership, taking full responsibility for project outcomes.
- Significant experience in Data Engineering, specifically with PySpark, data quality monitoring and workflow orchestration.
- Proficiency in Azure (or equivalent cloud providers) and hands-on experience with Infrastructure as Code principles.
- Robust DevOps mindset with practical experience automating CI/CD pipelines via GitHub Actions.
- A dedicated team player with excellent communication skills who thrives within a cross-functional, collaborative environment.
- Good command of English (B2/C1 level) and comfort using the language daily.
- A hybrid model is preferred (2-3 days per week in the Kraków office); alternatively, candidates must be available for on-site collaboration as required (approx. once a month).
Seems like lots of expectations, huh? Don’t worry! You don’t have to meet all the requirements.
What matters most is your passion and willingness to develop. Apply and find out!
A few perks of being with us