Staff Data Engineer
We foster a dynamic culture rooted in strong engineering, a sense of ownership, and transparency, empowering our team. As part of the expanding VirtusLab Group, we offer a compelling environment for those seeking to make a substantial impact in the software industry within a forward-thinking organization.
About the role
As a Staff Data Engineer and Tech Lead, you will build AI-powered web data acquisition. Deep technical ownership is the main focus of this crucial position. You will develop the architecture for our next-generation distributed web-crawling and data-extraction solution, solving complex scraping problems and setting the highest operational standards. You will also mentor engineers and establish company-wide best practices across international teams. With VirtusLab’s full support, this role is also a unique opportunity to develop your personal brand and build a strong community around your technical expertise by sharing your knowledge, networking, attending conferences, and planning meetups.
Our client is a NASDAQ-listed company that provides a range of solutions to support Go-To-Market (GTM) strategies. They offer a comprehensive B2B database platform that enables sales and marketing professionals to identify, connect with, and engage qualified prospects effectively. The core mission of our client is to equip every company with a complete, 360-degree view of their ideal customer, enhancing every phase of their GTM strategy and boosting their success in achieving business targets.
Python, Java, Airflow, BigQuery, Apache Kafka, Snowflake, AWS, GCP, Scrapy, GitHub, Terraform, Jenkins, Starburst, Trino, Apache Iceberg, K8s
- Drive the architecture and development of the next-generation distributed web-crawling and data-extraction solution.
- Assume deep technical ownership to solve complex scraping problems and establish high operational standards.
- Mentor and guide engineers to establish and share company-wide best practices across international teams.
The team is currently composed of customer-side engineers. A key responsibility for this role is to provide the technical leadership and direction necessary to build and expand this team.
What we expect in general
Technical Leadership & System Design:
- Proven experience building large-scale data systems from scratch.
- Strong architectural skills in designing scalable, fault-tolerant distributed systems.
- Track record leading complex technical initiatives and driving architecture direction for teams.
- Demonstrated ability to evolve production systems incrementally while maintaining reliability.
- Experience mentoring engineers at all levels and promoting a collaborative culture.
Data Engineering Expertise:
- Deep background in large-scale data engineering (terabytes daily).
- Hands-on experience with cloud data warehouses (BigQuery, Snowflake).
- Experience with Apache Kafka, Kubernetes (GKE/EKS), and orchestration tools (Airflow).
- Familiarity with multi-cloud environments (GCP + AWS).
- Expertise in designing and operating ETL/ELT pipelines.
Client Engagement & Advisory:
- Support the VirtusLab U.S. and international teams by lending senior technical expertise to client-facing activities, including technical discovery sessions, workshops, and solution architecture.
- Conduct requirements analysis and solution discovery, identifying business and technical needs.
- Provide technical consulting and advisory services, recommending appropriate data architectures aligned with customer goals.
- Prepare and review technical sections of commercial offers, including solution descriptions, statements of work (SoWs), project estimates, timelines, and delivery models.
Web Crawling & Data Extraction:
- Knowledge of web crawling technologies and advanced scraping (Scrapy or similar).
- Understanding of extracting structured/unstructured web data and SERP extraction.
- Deep awareness of proxy infrastructure management, anti-bot detection, and ethical crawling.
- Familiarity with crawling vendors and AI/LLM-based extraction approaches.
Seems like lots of expectations, huh? Don’t worry! You don’t have to meet all the requirements. What matters most is your passion and willingness to develop. Apply and find out!
*The compensation range for this role varies based on location, experience, and level. For candidates based in New York, the expected salary range is higher than for those located within a broader commuting radius (up to approximately two hours from the city).