Responsibilities
Support the operation and maintenance of overseas cloud-based services, ensuring platform stability, reliability, and performance; proactively identify and resolve system bottlenecks.
Follow internal operational processes, taking ownership of incident management, service request management, problem management, and change management.
Own platform software upgrades and the deployment, maintenance, and optimization of core systems.
Handle major incidents and day-to-day operational issues, restore services efficiently, perform root cause analysis, and drive long-term improvements.
Design, develop, and maintain automated operations tools to improve efficiency and optimize operational workflows.
Requirements
Bachelor’s degree or above in Computer Science or a related field, with at least 2 years of relevant experience.
Solid expertise in Linux system operations, and hands-on experience with containers, Kubernetes, Ansible, and other common DevOps / SRE tools.
Experience operating services on major cloud platforms such as AWS or Azure is highly preferred.
Proficiency in at least one scripting language such as Python or Shell; experience with Go / C / C++ is a plus.
Good documentation habits, with the ability to write and maintain operational procedures and technical documentation in a timely manner.
Strong service mindset, communication skills, problem-solving ability, and a proactive learning attitude.