Role description
· You operate Global Data Platform components (VM servers, Kubernetes, Kafka) and applications (Apache stack, Collibra, Dataiku, and similar)
· Implement automation of infrastructure, security components, and Continuous Integration & Continuous Delivery for optimal execution of data pipelines (ELT/ETL).
· Develop solutions that build resiliency into data pipelines through platform health checks, monitoring, and alerting mechanisms, so that the quality, timeliness, recency, and accuracy of data delivery are improved.
· Apply DevSecOps & Agile approaches to deliver a holistic, integrated solution in iterative increments.
· Liaise and collaborate with enterprise security, digital engineering, and cloud operations to gain consensus on architecture solution frameworks.
· Review system issues, incidents, and alerts to identify root causes and continuously implement features to improve platform performance.
· Stay current on the latest industry developments and technology trends to effectively lead and design new features/capabilities.
Experience
· You have 5+ years of experience in building or designing large-scale, fault-tolerant, distributed systems
· Experience migrating storage technologies (e.g. HDFS to S3 object storage)
· Integration of streaming and file-based data ingestion/consumption (Kafka, Control M, AWA)
· Experience in DevOps, data pipeline development, and automation using Jenkins and Octopus (optional: Ansible, Chef, XL Release, and XL Deploy)
· Experience predominantly with on-prem Big Data architecture; cloud migration experience may come in handy
· Hands-on experience in integrating Data Science Workbench platforms (e.g. Dataiku)
· Experience with agile project management and methods (e.g., Scrum, SAFe)
· Experience supporting all analytical value streams, from enterprise reporting (e.g. Tableau) to data science (incl. MLOps)
Skills
· Hands-on working knowledge of large data solutions (for example: data lakes, delta lakes, data meshes, data lakehouses, data platforms, data streaming solutions...)
· In-depth knowledge and experience in one or more large-scale distributed technologies including but not limited to: Hadoop ecosystem, Kafka, Kubernetes, Spark
· Expert in Python and Java or another language such as Scala or R, Linux/Unix scripting, Jinja templates, Puppet scripts, and firewall rule configuration
· VM setup and scaling (pods), K8s scaling, managing Docker images with Harbor, pushing images through CI/CD
· Experience using data formats such as Apache Parquet, ORC, or Avro
· Experience in machine learning algorithms is a plus
· Good knowledge of German is beneficial; an excellent command of English is essential
· Knowledge of the financial sector and its products
· Higher education (e.g. "Fachhochschule", "Wirtschaftsinformatik")