Senior Data Scraping Engineer (Python)

Mindrift · Remote (Ireland)
Software Development
Google Sheets

WFA Digital Insight

With the rapid growth of AI and machine learning, demand for skilled data scraping engineers has risen sharply, growing over 27% in the past two years. As companies like Mindrift pioneer hybrid AI-human systems, expertise in Python, web scraping, and data extraction is especially sought after, and the rise of Generative AI makes accurate, reliable data more important than ever. Before applying, candidates should know that this role requires a blend of technical precision, domain expertise, and quality control, which makes it an attractive opportunity for those looking to advance their skills in a rapidly evolving field.

Job Description

About the Role

The Senior Data Scraping Engineer role at Mindrift is a part-time remote opportunity that involves driving specialized data scraping workflows within a hybrid AI + human system. As a key member of the team, you will collaborate with Tendem Agents to deliver accurate and actionable results, drawing on your expertise in Python, web scraping, and data extraction. Day to day, you will handle data scraping tasks that require technical precision, use tools such as Apify and OpenRouter, and apply critical thinking and domain expertise to ensure high-quality results.

The role is part of the Tendem project, which connects specialists with AI projects from major tech innovators. Your work will contribute to the development of Generative AI capabilities, advancing real-world applications and unlocking the potential of AI. You will work closely with a team of experts, sharing knowledge and best practices to achieve common goals.

The company's mission is to unlock the potential of Generative AI by tapping into real-world expertise from across the globe. As a Senior Data Scraping Engineer, you will play a key role in achieving this mission, working on complex projects that require technical precision, creativity, and attention to detail.

What You Will Do

  • Own end-to-end data extraction workflows across complex websites, ensuring complete coverage, accuracy, and reliable delivery of structured datasets
  • Leverage internal tools (Apify, OpenRouter) alongside custom workflows to accelerate data collection, validation, and task execution while meeting defined requirements
  • Ensure reliable extraction from dynamic and interactive web sources, adapting approaches as needed to handle JavaScript-rendered content and changing site behavior
  • Enforce data quality standards through validation checks, cross-source consistency controls, adherence to formatting specifications, and systematic verification prior to delivery
  • Scale scraping operations for large datasets using efficient batching or parallelization, monitor failures, and maintain stability against minor site structure changes
  • Collaborate with Tendem Agents to handle repetitive tasks, providing critical thinking, domain expertise, and quality control to deliver accurate and actionable results
  • Utilize Google Sheets to store and manage extracted data, ensuring data cleanliness and organization
  • Develop and maintain custom workflows to extract data from complex websites, utilizing Python and web scraping techniques
  • Participate in performance-based bonus programs, receiving rewards for high-quality work and consistent delivery
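
As a rough illustration of the batching, parallelization, and failure monitoring mentioned above, the sketch below runs a fetch function over a list of URLs concurrently and collects failures instead of aborting the batch. It is not Mindrift's actual tooling: `scrape_in_parallel`, `fake_fetch`, and the example URLs are hypothetical names chosen for this example, and a real workflow would replace `fake_fetch` with an actual fetch-and-parse step.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterable


def scrape_in_parallel(
    urls: Iterable[str],
    fetch: Callable[[str], dict],
    max_workers: int = 4,
) -> tuple[list[dict], dict[str, str]]:
    """Run fetch(url) concurrently; return (records, failures keyed by URL)."""
    records: list[dict] = []
    failures: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its URL so failures can be attributed.
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                records.append(future.result())
            except Exception as exc:
                # Record the failure and keep processing the rest of the batch.
                failures[url] = repr(exc)
    return records, failures


def fake_fetch(url: str) -> dict:
    """Hypothetical stand-in for a real page fetch and parse step."""
    if "broken" in url:
        raise ValueError("unexpected page structure")
    return {"url": url, "title": url.rsplit("/", 1)[-1]}


if __name__ == "__main__":
    pages = [
        "https://example.com/a",
        "https://example.com/broken",
        "https://example.com/b",
    ]
    records, failures = scrape_in_parallel(pages, fake_fetch)
    print(len(records), "records;", len(failures), "failures")
```

Keeping failures in a separate mapping makes it straightforward to retry only the affected URLs or to flag a site whose structure has changed.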

What We Are Looking For

  • At least 3 years of relevant experience in data engineering, web scraping, automation, or software development
  • Strong experience in Python web scraping (BeautifulSoup, Selenium or similar), including dynamic content (JS, AJAX, infinite scroll) and APIs via proxies
  • Proven ability to extract data from complex structures (hierarchies, archived pages, inconsistent HTML)
  • Solid background in data cleaning, normalization, and validation, delivering structured datasets (CSV, JSON, Google Sheets)
  • Hands-on experience with LLMs and AI frameworks to enhance automation and problem-solving
  • Strong attention to detail and commitment to data accuracy
  • Self-directed work ethic with ability to troubleshoot independently
  • Bachelor's or Master's degree in Engineering, Applied Mathematics, Computer Science, or related technical fields
  • English proficiency: Upper-intermediate (B2) or above
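
To make the data cleaning, normalization, and validation requirement above more concrete, here is a minimal sketch using only the Python standard library. The schema (`name`, `url`, `price`), the rejection rules, and the function names `clean_records` and `to_csv` are assumptions made for illustration; actual projects would define their own fields and formatting specifications.

```python
import csv
import io

REQUIRED_FIELDS = ("name", "url", "price")  # hypothetical schema for this example


def clean_records(raw_records: list[dict]) -> tuple[list[dict], list[tuple[dict, str]]]:
    """Normalize and validate raw rows; return (valid rows, rejected rows with reasons)."""
    valid: list[dict] = []
    rejected: list[tuple[dict, str]] = []
    seen_urls: set[str] = set()
    for raw in raw_records:
        # Normalize: keep only known fields, treat None as empty, trim whitespace.
        row = {
            field: "" if raw.get(field) is None else str(raw[field]).strip()
            for field in REQUIRED_FIELDS
        }
        missing = [field for field in REQUIRED_FIELDS if not row[field]]
        if missing:
            rejected.append((raw, "missing: " + ", ".join(missing)))
            continue
        if not row["url"].startswith(("http://", "https://")):
            rejected.append((raw, "invalid url"))
            continue
        try:
            # Strip a currency symbol and thousands separators before parsing.
            row["price"] = float(row["price"].replace("$", "").replace(",", ""))
        except ValueError:
            rejected.append((raw, "unparseable price"))
            continue
        if row["url"] in seen_urls:
            rejected.append((raw, "duplicate url"))
            continue
        seen_urls.add(row["url"])
        valid.append(row)
    return valid, rejected


def to_csv(rows: list[dict]) -> str:
    """Serialize validated rows to CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=REQUIRED_FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Returning rejected rows together with a reason, rather than silently dropping them, supports the kind of systematic verification before delivery that the role describes.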

Nice to Have

  • Experience with Apify and OpenRouter
  • Knowledge of data storage solutions such as Google Sheets
  • Familiarity with performance-based bonus programs
  • Experience working with remote teams and collaborating with specialists from diverse backgrounds

Benefits and Perks

  • Competitive hourly rate, with the potential to earn up to $37 per hour equivalent
  • Opportunity to work on complex and challenging projects, advancing your skills and expertise in data scraping and extraction
  • Collaborative and dynamic work environment, with a team of experts in AI and machine learning
  • Flexible and remote work arrangement, allowing you to work from anywhere with a stable internet connection
  • Performance-based bonus programs, rewarding high-quality work and consistent delivery
  • Opportunity to contribute to the development of Generative AI capabilities, advancing real-world applications and unlocking the potential of AI
  • Access to cutting-edge tools and technologies, including Apify and OpenRouter
  • Professional development opportunities, including training and mentorship programs
  • Recognition and rewards for outstanding performance and contributions to the team

How to Stand Out

  • Develop a strong portfolio showcasing your expertise in Python, web scraping, and data extraction, highlighting complex projects and achievements.
  • Familiarize yourself with Apify and OpenRouter, utilizing these tools to accelerate data collection and validation.
  • Showcase your ability to work independently, troubleshooting issues and adapting to changing site behavior and requirements.
  • Highlight your experience working with remote teams and collaborating with specialists from diverse backgrounds, demonstrating your ability to communicate effectively and work towards common goals.
  • Be prepared to discuss your approach to data quality and validation, showcasing your attention to detail and commitment to delivering high-quality results.
  • Research Mindrift and the Tendem project, demonstrating your understanding of the company's mission and vision, and how your skills and expertise align with their goals and objectives.

This is a remote position listed on WFA Digital, the platform for professionals who work from anywhere.