Job Description
Req number: R6819
Employment type: Full time
Worksite flexibility: Hybrid
Who we are
CAI is a global technology services firm with over 8,500 associates worldwide and a yearly revenue of $1 billion+. We have over 40 years of excellence in uniting talent and technology to power the possible for our clients, colleagues, and communities. As a privately held company, we have the freedom and focus to do what is right—whatever it takes. Our tailor-made solutions create lasting results across the public and commercial sectors, and we are trailblazers in bringing neurodiversity to the enterprise.
Job Summary
We are looking for a motivated Databricks Developer ready to take us to the next level! If you understand ETL/ELT pipelines in Databricks using PySpark/Spark SQL and Delta Lake and are looking forward to your next career move, apply now.
Job Description
We are looking for a Databricks Developer. This position will be full time and hybrid, based in Bangalore.
What You’ll Do
- Design & develop scalable ETL/ELT pipelines in Databricks using PySpark/Spark SQL and Delta Lake (see the illustrative sketch after this list).
- Build and maintain notebooks, jobs, workflows, and Delta Live Tables (DLT) for batch/streaming ingestion.
- Optimize Spark performance (partitioning, caching, broadcast joins, shuffle tuning) and manage cluster costs.
- Implement data quality checks (e.g., expectations, unit tests), schema evolution, CDC, and medallion architecture (bronze/silver/gold).
- Collaborate with BI/Analytics to deliver curated datasets and semantic layers; document lineage and data contracts.
- Automate deployments via CI/CD.
- Secure data with Unity Catalog/ACLs, row‑level permissions, and compliance best practices.
- Monitor & troubleshoot jobs using Databricks Job UI, drive incident root‑cause analysis.
- Provide tier 3+ support to in-group and out-of-group customers, and make safe changes to production systems (OS, database, and application).
- Apply database scripting expertise in SQL or any comparable database.
- Stay up-to-date with the latest trends, tools, and techniques in data analytics, big data, and related technologies. Explore and evaluate emerging technologies and methodologies to drive innovation and improve data and analytics capabilities.
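To give a concrete sense of this kind of work, below is a minimal, illustrative PySpark sketch of a bronze-to-silver Delta Lake step with a simple data quality check. It is not a prescribed implementation: the storage path, catalog/table name, and column names (order_id, order_ts, amount) are hypothetical placeholders, and a production pipeline would typically add expectations, CDC-style MERGE upserts, and workflow orchestration.

```python
from pyspark.sql import SparkSession, functions as F

# Hypothetical locations for illustration only.
BRONZE_PATH = "abfss://lake@account.dfs.core.windows.net/bronze/orders"
SILVER_TABLE = "main.sales.orders_silver"

spark = (
    SparkSession.builder
    .appName("orders-bronze-to-silver")
    .getOrCreate()
)

# Read the raw (bronze) Delta data.
bronze = spark.read.format("delta").load(BRONZE_PATH)

# Basic cleansing and typing for the silver layer.
silver = (
    bronze
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
    .dropDuplicates(["order_id"])
)

# Simple data quality check: fail the job if required fields are null.
bad_rows = silver.filter(
    F.col("order_id").isNull() | F.col("amount").isNull()
).count()
if bad_rows > 0:
    raise ValueError(f"Data quality check failed: {bad_rows} rows with null order_id/amount")

# Write the curated silver table (overwrite kept simple here; a real
# pipeline would usually use MERGE for incremental/CDC loads).
(
    silver.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable(SILVER_TABLE)
)
```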
What You'll Need
Required:
- Strong PySpark/Spark SQL; solid understanding of distributed computing concepts (RDDs, DataFrames, partitions).
- Hands‑on with Databricks clusters, Jobs, Repos, Workflows, Delta Lake, and Unity Catalog (or equivalent).
- Experience with data modeling for analytics.
- Cloud proficiency (Azure preferred: ADLS/ABFSS, Key Vault, Data Factory/Synapse; or AWS S3/Glue/MSK; or GCP).
- CI/CD for data workflows.
- Strong SQL and one scripting language (Python preferred).
- Familiarity with data quality frameworks (assertions, unit/integration tests).
Good to have:
- Streaming (Structured Streaming, Kafka/Event Hub) knowledge.
- Delta Live Tables (advanced), Photon runtime, SQL Warehouses.
- Knowledge of cost optimization (cluster sizing, autoscaling, Spot/Low‑priority nodes).
- Exposure to ML pipelines (MLflow) for feature engineering (not mandatory).
- Experience integrating with Power BI/Tableau; semantic models; performance tuning.
Physical Demands
- Sedentary work that involves sitting or remaining stationary most of the time, with an occasional need to move around the office to attend meetings, etc.
- Ability to conduct repetitive tasks on a computer, utilizing a mouse, keyboard, and monitor.
Reasonable accommodation statement
If you require a reasonable accommodation in completing this application, interviewing, completing any pre-employment testing, or otherwise participating in the employment selection process, please direct your inquiries to or (888) 824-8111.
Apply for this Position
Ready to join CAI? Click the button below to submit your application.
Submit Application