To ensure each customer's sales data is written to its own Parquet file, which data pipeline configuration should you implement?

Multiple Choice

Explanation:
Partitioning the fact table by customer ID is what routes each customer's sales records into their own Parquet files. The fact table holds the transactional sales rows, and the customer ID ties every row to one specific customer, so a pipeline configured to partition on that key creates a separate storage partition (and thus separate Parquet files) per customer as the data is written. Because the partition key directly controls how the write output is organized, this is the configuration that meets the goal.

The other options do not control the output layout. Partitioning the dimension table affects only the dimension data, not the sales facts being written, so it cannot guarantee separate Parquet files per customer. Increasing the concurrency count changes how many files can be written in parallel, not which rows land in which file. Adding a SecureString parameter changes configuration or security controls; it has no effect on how the write output is partitioned.
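
The effect of partitioning on a key can be illustrated with a minimal sketch in plain Python. It writes no Parquet; it only groups fact rows by customer ID and shows the Hive-style directory name (key=value) that each group would map to. The function name partition_rows, the column names, and the sample rows are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def partition_rows(rows, key):
    """Group rows by `key`, mapping each group to a Hive-style partition name."""
    partitions = defaultdict(list)
    for row in rows:
        # Every distinct key value gets its own partition, e.g. "customer_id=101".
        partitions[f"{key}={row[key]}"].append(row)
    return dict(partitions)


# Hypothetical fact-table rows: each sale carries the customer it belongs to.
sales = [
    {"customer_id": 101, "order_id": 1, "amount": 25.0},
    {"customer_id": 102, "order_id": 2, "amount": 40.0},
    {"customer_id": 101, "order_id": 3, "amount": 15.5},
]

layout = partition_rows(sales, "customer_id")
for directory, rows in sorted(layout.items()):
    print(directory, [r["order_id"] for r in rows])

# In Spark, the comparable partitioned write would be (df and output_path assumed):
#   df.write.partitionBy("customer_id").parquet(output_path)
```

Here the two sales for customer 101 end up in one partition and the sale for customer 102 in another, which is the separation the question asks for. Raising parallelism would not change this grouping, since only the partition key decides where a row is written.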

