DP-900 Practice Questions: 15 Free Questions With Detailed Answers
Scenario-based DP-900 practice questions covering all four exam domains — core data concepts, relational data, non-relational data, and analytics workloads — with full explanations.
About These Questions
These 10 questions (with 5 more below) are written to match the style of the real DP-900 exam. They test your ability to apply concepts to realistic scenarios — not just recall definitions. Attempt each question before revealing the explanation.
A manufacturing company collects temperature readings from 10,000 IoT sensors every second. The data is analyzed immediately to detect equipment failures in real time. Which data processing approach describes this scenario?

A. Batch processing
B. Stream processing
C. ETL
D. OLAP
Reveal Answer & Explanation
Correct Answer: B
Stream processing handles data continuously as it arrives, enabling real-time analysis. IoT sensor data that must be analyzed "immediately" to detect failures is a classic stream processing scenario. Batch processing (A) processes data in scheduled intervals — the opposite of real-time. ETL (C) is a data pipeline pattern, not a processing timing approach. OLAP (D) refers to analytical query patterns, not the timing of processing.
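The timing difference between the two approaches can be sketched in plain Python (the sensor readings and failure threshold below are invented for illustration, not an Azure API):

```python
FAILURE_THRESHOLD = 90.0  # hypothetical temperature limit

def stream_process(readings):
    """Stream style: evaluate each reading the moment it arrives."""
    for sensor_id, temp in readings:
        if temp > FAILURE_THRESHOLD:
            yield sensor_id  # alert fires immediately, mid-stream

def batch_process(readings):
    """Batch style: wait for the whole interval, then analyze at once."""
    batch = list(readings)  # nothing is detected until the batch closes
    return [sid for sid, temp in batch if temp > FAILURE_THRESHOLD]

readings = [("s1", 72.5), ("s2", 95.1), ("s3", 88.0)]
print(list(stream_process(iter(readings))))  # → ['s2']
print(batch_process(iter(readings)))         # → ['s2']
```

Both find the same faulty sensor; the exam-relevant difference is when: the stream version can raise the alert before later readings even arrive.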
A data team is designing a data storage solution. They need storage for operational transaction processing (OLTP) and separate storage for analytical reporting (OLAP). Which statement correctly describes the difference between OLTP and OLAP workloads?
Reveal Answer & Explanation
Correct Answer: B
OLTP (Online Transaction Processing) workloads involve frequent, fast insert/update/delete operations on individual records — like order processing or banking transactions. OLAP (Online Analytical Processing) workloads involve complex queries that aggregate large volumes of historical data for reporting — like monthly sales analysis. The options in A reverse these definitions. C reverses the storage formats: OLAP typically uses columnar storage, while OLTP uses row-oriented storage. D is incorrect because OLTP involves heavy write operations.
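The contrast can be made concrete with SQLite (the Sales table and its rows are invented; SQLite simply stands in for any relational engine):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (region TEXT, amount REAL, sale_date TEXT)")

# OLTP: frequent, fast writes that each touch an individual record
conn.execute("INSERT INTO Sales VALUES ('West', 120.0, '2025-01-03')")
conn.execute("INSERT INTO Sales VALUES ('West', 80.0,  '2025-01-15')")
conn.execute("INSERT INTO Sales VALUES ('East', 200.0, '2025-01-20')")

# OLAP: a read that aggregates the accumulated history for reporting
rows = conn.execute(
    "SELECT region, SUM(amount) FROM Sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # → [('East', 200.0), ('West', 200.0)]
```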
Your organization runs a SQL Server 2019 database on-premises. The operations team wants to migrate to Azure with minimal code changes and maximum compatibility with existing SQL Server features, including SQL Agent jobs, cross-database queries, and CLR integration. Which Azure service should you recommend?

A. Azure SQL Database
B. Azure SQL Managed Instance
C. Azure Database for PostgreSQL
D. Azure Cosmos DB
Reveal Answer & Explanation
Correct Answer: B
Azure SQL Managed Instance provides near 100% compatibility with SQL Server, including SQL Agent jobs, cross-database queries, CLR integration, and linked servers. These features are NOT available in Azure SQL Database (A), which is a PaaS service with a subset of SQL Server features. Azure Database for PostgreSQL (C) is a different database engine. Cosmos DB (D) is a NoSQL database — incompatible with SQL Server workloads.
A developer needs to retrieve all customer names and their order totals from an Orders table, sorted by total amount descending, for customers who have spent more than $1,000. Which SQL statement correctly achieves this?
Reveal Answer & Explanation
Correct Answer: B
The correct SQL uses GROUP BY to aggregate orders per customer, HAVING to filter on the aggregated result, and ORDER BY for sorting. A is incorrect because WHERE cannot filter on aggregate functions. C references a "Total" column that doesn't exist, and GROUP BY must come before ORDER BY when using aggregates. D is syntactically invalid because HAVING must come after GROUP BY.
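A runnable sketch of the pattern the correct option follows, using SQLite and an invented Orders table (the column names are assumptions, since the original schema isn't shown):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (customer_name TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [("Ana", 700.0), ("Ana", 600.0), ("Ben", 400.0), ("Cal", 1500.0)],
)

# GROUP BY aggregates per customer, HAVING filters on the aggregate
# (WHERE cannot), ORDER BY sorts the totals descending.
query = """
    SELECT customer_name, SUM(amount) AS total
    FROM Orders
    GROUP BY customer_name
    HAVING SUM(amount) > 1000
    ORDER BY total DESC
"""
result = conn.execute(query).fetchall()
print(result)  # → [('Cal', 1500.0), ('Ana', 1300.0)]
```

Ben is filtered out by HAVING (400 < 1000), and Ana's two orders are summed before the filter is applied.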
A social network application needs to find all users who are "friends of friends" within 3 degrees of a given user. The recommendation engine must traverse relationships between users efficiently. Which Azure Cosmos DB API should you use?

A. Azure Cosmos DB for NoSQL
B. Azure Cosmos DB for Apache Cassandra
C. Azure Cosmos DB for Apache Gremlin
D. Azure Cosmos DB for Table
Reveal Answer & Explanation
Correct Answer: C
Graph databases are purpose-built for traversing relationships. Cosmos DB for Gremlin (Apache TinkerPop) stores entities as vertices and relationships as edges, making "find friends of friends within N degrees" a natural graph traversal query. NoSQL (A) would require complex application-level traversal logic and wouldn't scale for deep relationship queries. Cassandra (B) is a column-family store optimized for time-series and tabular data. Table API (D) is key-value — no relationship traversal capability.
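What a graph API does natively can be sketched as a breadth-first traversal in plain Python (the friendship data is invented; in Gremlin itself this is roughly a `repeat(...).times(3)` traversal step):

```python
from collections import deque

friends = {
    "alice": ["bob"],
    "bob": ["alice", "carol"],
    "carol": ["bob", "dave"],
    "dave": ["carol", "erin"],
    "erin": ["dave"],
}

def within_degrees(graph, start, max_hops=3):
    """Return everyone reachable from `start` in at most `max_hops` edges."""
    seen = {start}
    frontier = deque([(start, 0)])
    reachable = set()
    while frontier:
        user, hops = frontier.popleft()
        if hops == max_hops:
            continue  # don't expand past the hop limit
        for friend in graph.get(user, []):
            if friend not in seen:
                seen.add(friend)
                reachable.add(friend)
                frontier.append((friend, hops + 1))
    return reachable

print(sorted(within_degrees(friends, "alice")))  # → ['bob', 'carol', 'dave']
```

Note that "erin" is excluded: she is 4 hops from "alice". The application-level code above is exactly what a document or key-value store would force you to write yourself, which is the scaling problem the explanation describes.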
A company stores infrequently accessed archive data in Azure Blob Storage. The data must be retained for 7 years for compliance, is accessed less than once per year, and cost reduction is the highest priority. Which access tier should you use?

A. Hot
B. Cool
C. Cold
D. Archive
Reveal Answer & Explanation
Correct Answer: D
Archive tier offers the lowest storage cost and is designed for data accessed less than once per year, stored for 180+ days. The 7-year retention and "less than once per year" access clearly point to Archive. The tradeoff is retrieval latency (hours, not seconds) — acceptable for compliance archives. Hot (A) is for frequent access. Cool (B) requires 30 days minimum — cheaper than Hot but more expensive than Archive. Cold (C) is between Cool and Archive on the cost/access spectrum.
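The tier guidance reduces to a threshold check on expected access frequency. A minimal helper, assuming the only input is days between accesses (real tier selection also weighs retrieval cost and rehydration latency):

```python
def pick_access_tier(days_between_accesses: int) -> str:
    """Map expected access frequency to the cheapest suitable Blob tier."""
    if days_between_accesses >= 180:
        return "Archive"  # rarest access, lowest storage cost, hours to rehydrate
    if days_between_accesses >= 90:
        return "Cold"
    if days_between_accesses >= 30:
        return "Cool"
    return "Hot"

# A compliance archive accessed less than once a year:
print(pick_access_tier(365))  # → Archive
```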
A developer is choosing a Cosmos DB API for an existing application that uses MongoDB drivers and queries. The company wants to migrate to Azure with minimal code changes and wants global distribution capabilities. Which API should they choose?

A. Azure Cosmos DB for NoSQL
B. Azure Cosmos DB for MongoDB
C. Azure Cosmos DB for Apache Cassandra
D. Azure Cosmos DB for Table
Reveal Answer & Explanation
Correct Answer: B
Cosmos DB for MongoDB provides wire-level compatibility with MongoDB — existing MongoDB drivers, connection strings, and queries work without code changes. This is the explicit use case Microsoft designed it for: migrating MongoDB applications to Azure with global distribution benefits. Cosmos DB for NoSQL (A) stores JSON but uses a different query syntax — MongoDB drivers won't work. Cassandra (C) and Table (D) are completely different data models.
A data engineering team needs to copy data from an on-premises Oracle database to Azure Data Lake Storage on a nightly schedule. They also need to transform the data using Spark before loading it into Azure Synapse Analytics. Which Azure service is primarily responsible for the orchestration and scheduling of this pipeline?

A. Azure Synapse Analytics
B. Azure Data Factory
C. Microsoft Fabric
D. Azure Stream Analytics
Reveal Answer & Explanation
Correct Answer: B
Azure Data Factory (ADF) is Microsoft's cloud data integration service specifically designed to orchestrate data movement between sources and destinations on schedules. It can connect to Oracle, trigger Spark transformations, and load into Synapse. Azure Synapse (A) has pipeline capabilities but is primarily the analytics destination, not the orchestrator for cross-system pipelines from external sources. Microsoft Fabric (C) is a newer unified platform, but ADF is the classic answer for pipeline orchestration in DP-900. Azure Stream Analytics (D) handles real-time streaming data, not scheduled batch pipelines.
A business intelligence team creates dashboards that display real-time KPIs. Multiple reports from different datasets are pinned to a single view that managers check each morning. Some visuals are from a sales report, others from an HR report. Which Power BI component correctly describes this combined view?

A. Report
B. Dataset
C. Dashboard
D. Dataflow
Reveal Answer & Explanation
Correct Answer: C
A Power BI Dashboard is a single-page canvas where you pin individual visuals from multiple reports and datasets. This is exactly the scenario described — visuals from a sales report and an HR report combined into one view. A Report (A) is based on a single dataset and can have multiple pages, but cannot combine visuals from different datasets. A Dataset (B) is the data model, not a visual artifact. A Dataflow (D) is a data preparation/ETL construct, not a display component.
A company wants to analyze petabyte-scale data using both SQL queries and Apache Spark in a single workspace. They need the ability to query data directly in Azure Data Lake Storage without loading it into a database first. Which Azure service best fits this requirement?

A. Azure SQL Database
B. Azure Synapse Analytics
C. Azure Databricks
D. Power BI Premium
Reveal Answer & Explanation
Correct Answer: B
Azure Synapse Analytics is the only listed service that combines dedicated SQL pools (provisioned data warehousing), serverless SQL pools (querying data lake files in place, without loading), and Apache Spark pools in a single unified workspace. The ability to query data "directly in Azure Data Lake Storage without loading" refers specifically to Synapse serverless SQL. Azure SQL Database (A) is not designed for petabyte-scale analytics. Azure Databricks (C) provides Spark but not the serverless SQL-on-data-lake capability. Power BI Premium (D) is a BI tool, not an analytics compute service.
Want 500 More DP-900 Questions?
Full question bank covering all 4 DP-900 domains with practice mode and timed exam simulation.
40 questions free — no credit card required.
Start Free Practice →

DP-900 Exam Question Patterns
Understanding how the exam frames questions helps you answer faster and more accurately. Here are the most common DP-900 question patterns:
"Which service should you use?" scenarios
The most common question type. A scenario describes a business requirement, and you must select the correct Azure data service. Master the differences between Cosmos DB, Azure SQL, Synapse, ADF, Databricks, and Fabric to nail these.
API selection for Cosmos DB
Given a data type or migration scenario, which Cosmos DB API? NoSQL for new document apps, MongoDB for existing MongoDB apps, Gremlin for graph, Cassandra for column-family, Table for key-value migration.
Blob Storage access tier selection
Given access frequency and retention requirements, which access tier? Hot (frequent), Cool (30+ days, infrequent), Cold (90+ days, rare), Archive (180+ days, very rare, cheapest).
Power BI component identification
Distinguish between datasets, reports, dashboards, and dataflows. Know which is the data model, which is the visual layer, and which combines multiple reports.
Related Resources
DP-900 Study Guide 2026
Complete DP-900 Azure Data Fundamentals exam prep with all domains and study plan.
Is DP-900 Worth It in 2026?
Salary data and career impact analysis for Azure Data Fundamentals certification.
DP-900 vs AZ-900 Comparison
Compare Azure Data Fundamentals and Azure Fundamentals to choose the right starting point.
Cosmos DB vs Azure SQL for DP-900
Key differences between NoSQL and relational Azure data services — a core DP-900 topic.