You are designing a Spark job that performs batch processing of daily web log traffic.
When you deploy the job in the production environment, it must meet the following requirements:
Run once a day.
Display status information on the company intranet as the job runs.
You need to recommend technologies for triggering and monitoring jobs.
Which technologies should you recommend? To answer, drag the appropriate technologies to the correct locations. Each technology may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Select and Place:
Answer is in the explanation below.
Reference / correct answer:
Box 1: Livy
You can use Livy to run interactive Spark shells or submit batch jobs to be run on Spark.
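As a rough illustration of the batch-submission path, the sketch below builds (but does not send) a request to Livy's REST API, where `POST /batches` submits a job and `GET /batches/{id}/state` reports its state. The endpoint URL, JAR path, class name, and the `X-Requested-By` value are hypothetical placeholders, not values from the question, and authentication is left out.

```python
import json
import urllib.request


def build_livy_batch_request(livy_url, jar_path, class_name, args=None):
    """Build (without sending) a POST /batches request for Livy's REST API."""
    payload = {
        "file": jar_path,          # location of the application JAR (placeholder)
        "className": class_name,   # main class of the Spark job (placeholder)
        "args": list(args or []),  # command-line arguments passed to the job
    }
    return urllib.request.Request(
        url=livy_url.rstrip("/") + "/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Requested-By": "admin",  # assumed value; some deployments require this header
        },
        method="POST",
    )


def batch_state_url(livy_url, batch_id):
    """URL that returns the current state of a submitted batch."""
    return f"{livy_url.rstrip('/')}/batches/{batch_id}/state"
```

A scheduler of your choice (not shown) could send such a request once a day with `urllib.request.urlopen`; Livy itself does not schedule jobs.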
Box 2: Beeline
Apache Beeline can be used to run Apache Hive queries on HDInsight. You can use Beeline with Apache Spark.
Note: Beeline is a Hive client that is included on the head nodes of your HDInsight cluster. Beeline uses JDBC to connect to HiveServer2, a service hosted on your HDInsight cluster. You can also use Beeline to access Hive on HDInsight remotely over the internet.
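For the remote-access case described in the note, a minimal sketch of how the Beeline connection might be assembled is shown below. It assumes the public HDInsight endpoint form (`<cluster>.azurehdinsight.net` on port 443 over HTTP transport); the cluster name, user name, and query are hypothetical, and the password option (`-p`) is deliberately omitted.

```python
def beeline_jdbc_url(cluster_name):
    """JDBC URL for HiveServer2 over the public HDInsight endpoint (assumed form)."""
    return (
        f"jdbc:hive2://{cluster_name}.azurehdinsight.net:443/"
        ";ssl=true;transportMode=http;httpPath=/hive2"
    )


def beeline_command(cluster_name, user, query):
    """Argument list for a Beeline invocation: -u URL, -n user name, -e query."""
    return ["beeline", "-u", beeline_jdbc_url(cluster_name), "-n", user, "-e", query]
```

The returned list can be passed to `subprocess.run` on a machine where the Beeline client is installed.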
You have data on the 75,000 employees of your company. The data contains the properties shown in the following table.
You need to store the employee data in an Azure Cosmos DB container. Most queries on the data will filter by the Current Department and the Employee Surname properties.
Which partition key and item ID should you use for the container? To answer, select the appropriate options in the answer area.
You are designing the security for a mission-critical Azure SQL database named DB1. DB1 contains several columns that store Personally Identifiable Information (PII) data.
You need to recommend a security solution that meets the following requirements:
Ensures that DB1 is encrypted at rest
Ensures that data from the columns containing PII data is encrypted in transit
Which security solution should you recommend for DB1 and the columns? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:
Answer is in the explanation below.
Reference / correct answer:
DB1: Transparent Data Encryption
Azure SQL Database currently supports encryption at rest for Microsoft-managed service-side and client-side encryption scenarios.
Support for server encryption is currently provided through the SQL feature called Transparent Data Encryption.
Columns: Always Encrypted
Always Encrypted is a feature designed to protect sensitive data stored in Azure SQL Database or SQL Server databases. Always Encrypted allows clients to encrypt sensitive data inside client applications and never reveal the encryption keys to the database engine (SQL Database or SQL Server).
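To make the client-side nature of Always Encrypted concrete, the sketch below builds an ODBC connection string with `ColumnEncryption=Enabled`, the Microsoft ODBC driver keyword that turns on transparent encryption and decryption in the client. The server name and driver version are assumptions, and authentication settings are omitted. The accompanying query is one way to check whether Transparent Data Encryption is on for a database.

```python
def build_connection_string(server, database, driver="ODBC Driver 17 for SQL Server"):
    """Connection string with TLS and Always Encrypted enabled on the client."""
    parts = {
        "Driver": "{" + driver + "}",   # assumed driver version
        "Server": server,
        "Database": database,
        "Encrypt": "yes",               # TLS for the connection itself
        "ColumnEncryption": "Enabled",  # Always Encrypted handled by the client driver
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


# Parameterized query reporting whether TDE is enabled (is_encrypted = 1) for a database.
TDE_STATUS_QUERY = "SELECT name, is_encrypted FROM sys.databases WHERE name = ?"
```

A library such as pyodbc could take this string in `pyodbc.connect(...)`; the client also needs access to the column master key, which is outside the scope of this sketch.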
Note: Most data breaches involve the theft of critical data such as credit card numbers or personally identifiable information. Databases can be treasure troves of sensitive information. They can contain customers' personal data (like national identification numbers), confidential competitive information, and intellectual property. Lost or stolen data, especially customer data, can result in brand damage, competitive disadvantage, serious fines, and even lawsuits.