Required Skills
- Bachelor’s degree in computer science (or a related field) with 1+ years of relevant experience
- Advanced SQL skills and data warehousing knowledge
- Experience in data mining, profiling, modeling, and ETL design
- Proficiency in Python and PySpark
- Familiarity with Big Data tools (HDFS, Hive, Spark, NiFi, Kafka)
- Comfortable with Linux command line and shell scripting
- Experience building and optimizing data pipelines for batch/stream processing
- Experience with cloud big data technologies (AWS, Databricks, and Snowflake)
- Strong understanding of distributed systems and data architecture
- Excellent problem-solving and communication skills
- Ability to thrive in a fast-paced, deadline-driven environment
Nice-to-Have Skills
- Experience with GenAI and LLMs, agentic AI with MCP, and machine learning frameworks (e.g., TensorFlow, PyTorch)
- Knowledge of cloud platforms (AWS, GCP, Azure)
- Familiarity with MLOps/DataOps and CI/CD pipelines
- Experience with data visualization tools (Tableau, Power BI)
- API development skills
- Industry-specific experience (finance, healthcare, etc.)
This is a hybrid position. The expected number of in-office days will be confirmed by your hiring manager.