Job Description
Defined is a leading provider of high-quality, ethically sourced data for Artificial Intelligence (AI) and Machine Learning (ML) model training. We host the world's largest AI marketplace and offer end-to-end services to help companies accelerate their AI solutions. Backed by significant funding and recognized globally for our commitment to ethical AI, we operate in a fast-paced, innovative environment with offices in Seattle and Lisbon.
This is a hybrid or remote position.
What will you do?
Pipeline Orchestration
Design and maintain end-to-end data workflows using Dagster, handling complex dependencies, retries, backfills, and observability.
Build asset-based pipelines with clear ownership, lineage, and SLAs.
Data Transformation
Develop modular dbt models (staging, intermediate, marts) to transform raw data into clean, production-grade datasets.
Apply best practices in testing, documentation, and versioning.
Data Ingestion & Python Development
Write robust Py...
Apply for this Position
Ready to join Defined? Click the button below to submit your application.