Updated: July 27, 2020 (January 2, 2020)

SQL Server 2019 Big Data Clusters

Andrew analyzes and writes about Microsoft's data management, business intelligence, and machine learning solutions, as well as aspects of licensing...

SQL Server 2019 introduces Big Data Clusters, a new deployment option that allows SQL Server to work with data lakes and unstructured data on its own.
Performance gains are likely for workloads that combine relational data with unstructured data.
A centrally controlled data lake could help customers share unstructured data more securely for multiple applications.
Administrators will have to master technologies new to many database teams, such as HDFS and container orchestration with Kubernetes, to take advantage of Big Data Cluster capabilities.

SQL Server 2019 introduces Big Data Clusters, a deployment option that enables SQL Server to create, manage, and query large, unstructured data sources, such as a data lake, on its own. Key changes include support for the Hadoop Distributed File System (HDFS), which sits alongside the SQL Server database engine and removes the need to deploy an external unstructured data management system. Also included are Apache Spark for machine learning workloads and scale-out features optimized for distributed computing and increased performance. However, the deployment option does not simplify managing the HDFS components, and it requires a container orchestration solution such as Kubernetes.
