Updated: July 24, 2020 (November 21, 2018)

SQL Server Big Data Architecture

Andrew analyzes and writes about Microsoft's data management, business intelligence, and machine learning solutions, as well as aspects of licensing...

SQL Server 2019 Big Data Clusters use a multitier configuration in which pools of servers perform different functions, such as data ingestion, querying, aggregation, and management.

The first tier is a Kubernetes cluster (top). The Kubernetes cluster manages SQL Server instances in Kubernetes pods, which are collections of Docker-style container instances that can be scaled out or in as demand changes. The Kubernetes cluster is created and managed using a hosted Kubernetes service, such as Azure Kubernetes Service (AKS) or Amazon Elastic Kubernetes Service (EKS), or a Kubernetes platform that can be deployed on-premises, such as OpenShift.

The second tier is the Master pool (top), which contains the SQL Server Master instance. The Master instance is a traditional SQL Server instance with normal read-write online transaction processing capabilities and databases. It is also the head node of the Big Data Cluster, which brings new responsibilities: it receives T-SQL and Spark requests (top middle), distributes the queries to the appropriate servers in the cluster, gathers the results, and returns the data to the requesting application or user. It also exposes the underlying unstructured data as regular tables that users and developers can reference when creating queries. The Master pool can contain a single SQL Server container instance or several SQL Server container instances that form an Always On Availability Group.
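The head-node role described above (receive a request, distribute it, gather the results, return them) follows a general scatter-gather pattern. The sketch below is a toy illustration of that pattern only, not SQL Server's actual implementation: the worker names, the in-memory `SHARDS` data, and the functions `run_on_worker` and `head_node_query` are all hypothetical and exist purely for this example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the servers behind the head node.
# Each "worker" holds one shard of the rows (assumed data, for illustration).
SHARDS = {
    "worker-0": [{"region": "east", "sales": 120}, {"region": "west", "sales": 80}],
    "worker-1": [{"region": "east", "sales": 45}, {"region": "west", "sales": 200}],
    "worker-2": [{"region": "east", "sales": 10}],
}


def run_on_worker(worker, region):
    """Compute a partial sum over the rows held by a single worker."""
    return sum(row["sales"] for row in SHARDS[worker] if row["region"] == region)


def head_node_query(region):
    """Scatter the request to every worker, gather the partial sums, and combine them."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda w: run_on_worker(w, region), SHARDS))
    return sum(partials)


if __name__ == "__main__":
    print(head_node_query("east"))
```

The caller sees a single function and a single answer, much as an application sees only the Master instance; the fan-out to the workers and the merging of partial results stay hidden behind it.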
