Posted 16 May 2023
Job type Permanent
Reference 1551143

Company's Benefits

  • Flexible Working Arrangements

  • Equal Pay Initiatives

  • Mentorship Program

  • Leadership Development Program

  • Paid Parental Leave

  • Return to Work Policy

  • Childcare Facilities

  • Breastfeeding Rooms

  • Sponsorship Program

  • Coaching Program

  • Raise Numbers Of Women In Leadership

  • Internal Women's Networking Group

Job Description


We are the Azure Data teams, part of the Azure organization. Azure Data teams present interesting challenges in technologies such as operational stores, big data, and governance. We build reliable, highly scalable, and high-performance distributed systems for governance and data analytics on Azure. We work with various open-source technologies such as Atlas, Spark, Elasticsearch, and Kubernetes, and we contribute to these technologies. We are building the next generation of Microsoft’s exa-scale Data Governance service, leveraging Microsoft Purview, and we want you to help us take it to the next level of functionality and scale. This is a once-in-a-lifetime opportunity to be part of a very agile team, take on hard distributed-systems problems, and ship mission-critical features at a rapid pace. Microsoft puts customers in control of their personal data and backs this commitment with continuous investment in governance infrastructure. Our team is responsible for Governance Compliance of all companies’ data, and we catalog, classify, and enforce policies for billions of data assets daily. We need collaborative developers who can think big, deliver on those big challenges, and along the way, change the world. We’re looking for engineers to build these systems from the ground up.

  • Faster career growth due to the high visibility and business impact of the service.

  • Autonomy to drive major feature areas.

  • Cutting-edge technologies (Spark, Kubernetes, Atlas, Egeria, Elasticsearch, Flow Engine, GraphDB, NLP, connector framework, Scanning, Airflow).

  • Collaborative, supportive culture.

The Azure Data Governance China team has openings ranging from entry level to senior. We are seeking top talent with a passion for big data, data discovery, and data governance. You will learn cutting-edge big data services and technologies such as catalog, Spark, Kubernetes, lineage, scanning, workflow, and search engines. You are expected to learn hundreds of industrial data stores and to build a scanning and metadata connector framework that retrieves metadata and classifications from those stores in an extensible and scalable way. You will have opportunities to reach out to customers to understand and solve real customer pain points.


We are looking for a great software engineer with experience in backend services and data pipelines to build the next generation of Microsoft's data governance services.

  • Design and develop large-scale distributed systems.

  • Deploy and operate services in production.

  • Work with customers to resolve their issues and gather requirements for new features.

  • Work with Microsoft stakeholders inside and outside the immediate team to make sure our code is compliant and secure as well as solving customers' problems.


Required Qualifications:

  • 7+ years of professional software development experience.

  • Fluent in one or more programming languages such as Java, Golang, or C#.

  • Solid knowledge of data structures and familiarity with common algorithms.

  • Good knowledge of OO design and a basic understanding of functional programming concepts.

  • Experience designing and building large-scale cloud services and software systems.

Preferred Qualifications:

  • BS/MS in Computer Science, Mathematics/Physics, or Engineering, or equivalent experience.

  • Familiar with container-related technologies (Docker, Kubernetes).

  • Experience customizing Kubernetes, such as developing operators and defining CRDs.

  • Experience with web service development and familiarity with related technologies (Spring Cloud, REST).

  • Experience in data governance areas such as data catalog, lineage, discovery, and data policy, and familiarity with related technologies such as Apache Atlas, Egeria, and OpenLineage.

  • Solid knowledge of distributed systems.

  • Experience designing and building large web services and familiarity with related technologies (gRPC, REST).

  • Familiar with cloud platforms such as Azure, AWS, GCP, and AliCloud.

  • Familiar with popular data stores (relational, document, wide column, key-value, etc.) such as MySQL, Oracle, SQL Server, MongoDB, CosmosDB, Redis, Cassandra, HBase, S3, and Azure Storage.

  • Familiar with big data technologies such as Spark, Hadoop, Flink, and Kafka.

  • Familiar with full-text search technologies (Apache Lucene, Elasticsearch, Apache Solr).

  • Familiar with graph databases (Neo4j, Gremlin, JanusGraph, etc.).

  • Familiar with machine learning and NLP.

  • Familiar with workflow-related technologies (Apache Airflow, etc.).

  • Familiar with service mesh technologies (Istio).

  • Familiar with modern security models such as OAuth and token-based authentication and authorization.

  • Experience building and shipping production-grade software or services.

  • Experience using agile methodologies or test-driven development (TDD).

  • Great curiosity and willingness to question.

  • High enthusiasm, integrity, ingenuity, results orientation, self-motivation, and resourcefulness in a fast-paced, competitive environment.

  • Have a deep desire to work collaboratively, solve problems with groups, find win-win solutions, and celebrate successes.

  • Get excited by the challenge of hard technical problems.

  • Solve problems by always leading with deep passion and empathy for customers.