Job Description

About the Project:

We are looking for a skilled engineer to join a dynamic engineering team at one of our clients, working on a high-impact tax reporting platform. The primary objective of this project is to modernize and significantly accelerate Excel-based report generation, reducing processing time from minutes to seconds. The role involves ingesting data from multiple upstream systems, transforming it with efficient data processing libraries, and exposing it through APIs. The team focuses on scalability, maintainability, and developer productivity, leveraging spec-driven development and AI-powered tools.


You’ll be contributing to the backend architecture and data pipeline powering this transformation, helping to evolve a high-performance system that's central to the client’s reporting domain.


Responsibilities:

  • Design, build, and maintain high-performance data processing pipelines using Python libraries (Pandas required; Polars nice-to-have).
  • Develop and expose RESTful APIs using FastAPI or similar frameworks.
  • Consume and process normalized Parquet files from multiple upstream sources to generate dynamic Excel reports (see the sketch after this list).
  • Contribute to a spec-driven development workflow (using GitHub Copilot, Claude, etc.) to scaffold and generate API/data pipeline code.
  • Optimize report generation logic for speed and scalability, currently targeting sub-20 second response times.
  • Integrate with messaging and storage mechanisms (e.g., Service Bus, Storage Accounts).
  • Collaborate on infrastructure-as-code automation using Bicep or similar tools (Terraform, CDK).
  • Participate in design discussions for future migration to Snowflake and/or a data lake architecture.
  • Contribute to CI/CD pipelines using GitHub Actions.
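
To make these responsibilities concrete, here is a minimal sketch of the kind of flow involved: a FastAPI endpoint that reads a normalized Parquet extract with Pandas, renders an Excel workbook in memory, and streams it back. All route, file, and column names are hypothetical; this illustrates the stack rather than the client's actual service.

    import io

    import pandas as pd
    from fastapi import FastAPI
    from fastapi.responses import StreamingResponse

    app = FastAPI()

    @app.get("/reports/{tax_year}")
    def get_report(tax_year: int) -> StreamingResponse:
        # In the real system the extract would come from upstream storage
        # (e.g., a Storage Account); here it is a local, hypothetical file.
        # Reading Parquet requires pyarrow or fastparquet to be installed.
        df = pd.read_parquet("normalized_extract.parquet")
        subset = df[df["tax_year"] == tax_year]

        # Build the workbook in an in-memory buffer rather than on disk;
        # xlsxwriter is one common engine choice.
        buffer = io.BytesIO()
        with pd.ExcelWriter(buffer, engine="xlsxwriter") as writer:
            subset.to_excel(writer, sheet_name="Report", index=False)
        buffer.seek(0)

        return StreamingResponse(
            buffer,
            media_type="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
            headers={"Content-Disposition": f'attachment; filename="report_{tax_year}.xlsx"'},
        )

Rendering the workbook in memory avoids per-request disk I/O, which matters when the target is sub-20 second response times.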


Required Skills and Experience:

  • Strong proficiency in Python for data processing with hands-on expertise in Pandas.
  • Ability to quickly learn new frameworks such as Polars if needed.
  • Experience building backend services or APIs using frameworks like FastAPI.
  • Solid understanding of data modeling principles (e.g., star schemas) and handling normalized datasets (see the toy example after this list).
  • Familiarity with enterprise messaging patterns and data integration from various sources (API-based and file-based).
  • Experience working with GitHub and CI/CD pipelines (GitHub Actions or similar).
  • Infrastructure-as-Code experience with Bicep or comparable tools (Terraform, AWS CDK).
  • Comfort with spec-driven development and leveraging AI tools like GitHub Copilot for scaffolding.
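
As a toy illustration of the star-schema point above (every table and column here is invented for the example), reporting queries in this model reduce to joining a fact table to its dimensions:

    import pandas as pd

    # Hypothetical fact table keyed to a dimension table, star-schema style.
    fact_filings = pd.DataFrame(
        {"entity_id": [1, 2, 1], "tax_year": [2023, 2023, 2024], "amount": [100.0, 250.0, 75.0]}
    )
    dim_entity = pd.DataFrame({"entity_id": [1, 2], "entity_name": ["Acme Corp", "Globex"]})

    # Denormalize by joining the fact table to its dimension on the surrogate key.
    report_rows = fact_filings.merge(dim_entity, on="entity_id", how="left")
    print(report_rows)

The same join shape carries over whether the tables live in Pandas, DuckDB, or Snowflake.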


Nice-to-Have / Preferred Qualifications:

  • Experience with Polars.
  • Experience with orchestration tools like Apache Airflow or transformation frameworks like dbt.
  • Exposure to Snowflake (streams, tasks, stored procedures).
  • Experience working with DuckDB and/or DocTV/OpenTI XLarge for report generation.
  • Knowledge of Angular or frontend plugins for Excel.
  • Familiarity with async workflows and distributed processing concepts.
