Question no 1
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You have a Python script named train.py in a local folder named scripts. The script trains a regression model by using scikit-learn, and includes code to load a training data file that is also located in the scripts folder. You must run the script as an Azure ML experiment on a compute cluster named aml-compute. You need to configure the run to ensure that the environment includes the required packages for model training. You have instantiated a variable named aml-compute that references the target compute cluster.

Solution: Run the following code:

Does the solution meet the goal?
    ↠ Yes
    ↠ No
Answer Description

Explanation: The scikit-learn estimator provides a simple way of launching a scikit-learn training job on a compute target. It is implemented through the SKLearn class, which can be used to support single-node CPU training.

Example:

from azureml.train.sklearn import SKLearn

estimator = SKLearn(source_directory=project_folder,
                    compute_target=compute_target,
                    entry_script='train_iris.py')

Reference: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-train-scikit-learn
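As a follow-up, a minimal sketch of submitting such an estimator as an experiment run, assuming a workspace config file is present and that aml_compute references the provisioned cluster (the experiment name and variable names are illustrative):

from azureml.core import Workspace, Experiment
from azureml.train.sklearn import SKLearn

ws = Workspace.from_config()                                  # loads config.json for the workspace
experiment = Experiment(workspace=ws, name='sklearn-train')   # illustrative experiment name

estimator = SKLearn(source_directory='scripts',               # local folder containing train.py
                    compute_target=aml_compute,               # assumed variable referencing the aml-compute cluster
                    entry_script='train.py')

run = experiment.submit(estimator)
run.wait_for_completion(show_output=True)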
Question no 2
You create a classification model with a dataset that contains 100 samples with Class A and 10,000 samples with Class B. The variation of Class B is very high. You need to resolve the class imbalance. Which method should you use?
    ↠ Partition and Sample
    ↠ Cluster Centroids
    ↠ Tomek links
    ↠ Synthetic Minority Oversampling Technique (SMOTE)
Answer: Synthetic Minority Oversampling Technique (SMOTE)
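The SMOTE module in Studio needs no code, but as a rough illustration of the technique itself, here is a minimal sketch using the third-party imbalanced-learn package (the package choice and sample counts are assumptions, not part of the question):

from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

# synthetic imbalanced data, roughly 100 minority vs 10,000 majority samples
X, y = make_classification(n_samples=10100, weights=[0.99], random_state=0)
print(Counter(y))

# SMOTE synthesizes new minority-class samples by interpolating between
# a minority sample and its nearest minority-class neighbors
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print(Counter(y_res))   # classes are now balanced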
Question no 3
You create an Azure Machine Learning workspace and set up a development environment. You plan to train a deep neural network (DNN) by using the TensorFlow framework and by using estimators to submit training scripts. You must optimize computation speed for training runs. You need to choose the appropriate estimator to use as well as the appropriate training compute target configuration. Which values should you use? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
Answer Description
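For reference, a minimal sketch of the TensorFlow estimator API that the question concerns, assuming a GPU-enabled compute cluster; all names here are illustrative, not the exam's answer values:

from azureml.train.dnn import TensorFlow

# assumes gpu_compute references a GPU-enabled AmlCompute cluster
estimator = TensorFlow(source_directory='scripts',   # illustrative folder
                       compute_target=gpu_compute,
                       entry_script='train.py',      # illustrative script name
                       use_gpu=True)                 # run the training on GPU for speed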
Question no 4
You have a dataset that contains 2,000 rows. You are building a machine learning classification model by using Azure Machine Learning Studio. You add a Partition and Sample module to the experiment. You need to configure the module to meet the following requirements:
- Divide the data into subsets.
- Assign the rows into folds using a round-robin method.
- Allow rows in the dataset to be reused.
How should you configure the module? To answer, select the appropriate options in the dialog box in the answer area. NOTE: Each correct selection is worth one point.
Answer Description
Question no 5
You are implementing a machine learning model to predict stock prices. The model uses a PostgreSQL database and requires GPU processing. You need to create a virtual machine that is pre-configured with the required tools. What should you do?
    ↠ Create a Data Science Virtual Machine (DSVM) Windows edition.
    ↠ Create a Geo AI Data Science Virtual Machine (Geo-DSVM) Windows edition.
    ↠ Create a Deep Learning Virtual Machine (DLVM) Linux edition.
    ↠ Create a Deep Learning Virtual Machine (DLVM) Windows edition.
    ↠ Create a Data Science Virtual Machine (DSVM) Linux edition.
Answer: Create a Data Science Virtual Machine (DSVM) Linux edition.
Question no 6
You use Azure Machine Learning Studio to build a machine learning experiment. You need to divide data into two distinct datasets. Which module should you use?
    ↠ Split Data
    ↠ Load Trained Model
    ↠ Assign Data to Clusters
    ↠ Group Data into Bins
Answer: Split Data
Answer Description Explanation: The Split Data module is used to divide a dataset into two distinct sets, which is what the scenario requires. (The Group Data into Bins module, by contrast, apportions values into bins rather than dividing a dataset in two.) References: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/split-data https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/group-data-into-bins
Question no 7
You have a dataset that contains 2,000 rows. You are building a machine learning classification model by using Azure Machine Learning Studio. You add a Partition and Sample module to the experiment. You need to configure the module to meet the following requirements:
- Divide the data into subsets.
- Assign the rows into folds using a round-robin method.
- Allow rows in the dataset to be reused.
How should you configure the module? To answer, select the appropriate options in the dialog box in the answer area. NOTE: Each correct selection is worth one point.
Answer Description
Question no 8
You are implementing a machine learning model to predict stock prices. The model uses a PostgreSQL database and requires GPU processing. You need to create a virtual machine that is pre-configured with the required tools. What should you do?
    ↠ Create a Data Science Virtual Machine (DSVM) Windows edition.
    ↠ Create a Geo AI Data Science Virtual Machine (Geo-DSVM) Windows edition.
    ↠ Create a Deep Learning Virtual Machine (DLVM) Linux edition.
    ↠ Create a Deep Learning Virtual Machine (DLVM) Windows edition.
    ↠ Create a Data Science Virtual Machine (DSVM) Linux edition.
Answer: Create a Data Science Virtual Machine (DSVM) Linux edition.
Question no 9
A set of CSV files contains sales records. All the CSV files have the same data schema. Each CSV file contains the sales records for a particular month and has the filename sales.csv. Each file is stored in a folder that indicates the month and year when the data was recorded. The folders are in an Azure blob container for which a datastore has been defined in an Azure Machine Learning workspace. The folders are organized in a parent folder named sales to create the following hierarchical structure:

At the end of each month, a new folder with that month's sales file is added to the sales folder. You plan to use the sales data to train a machine learning model based on the following requirements:
- You must define a dataset that loads all of the sales data to date into a structure that can be easily converted to a dataframe.
- You must be able to create experiments that use only data that was created before a specific previous month, ignoring any data that was added after that month.
- You must register the minimum number of datasets possible.
You need to register the sales data as a dataset in the Azure Machine Learning service workspace. What should you do?
    ↠ Create a tabular dataset that references the datastore and explicitly specifies each 'sales/mm-yyyy/ sales.csv' file every month. Register the dataset with the name sales_dataset each month, replacing the existing dataset and specifying a tag named month indicating the month and year it was registered. Use this dataset for all experiments.
    ↠ Create a tabular dataset that references the datastore and specifies the path 'sales/*/sales.csv', register the dataset with the name sales_dataset and a tag named month indicating the month and year it was registered, and use this dataset for all experiments.
    ↠ Create a new tabular dataset that references the datastore and explicitly specifies each 'sales/mm-yyyy/ sales.csv' file every month. Register the dataset with the name sales_dataset_MM-YYYY each month with appropriate MM and YYYY values for the month and year. Use the appropriate month-specific dataset for experiments.
    ↠ Create a tabular dataset that references the datastore and explicitly specifies each 'sales/mm-yyyy/ sales.csv' file. Register the dataset with the name sales_dataset each month as a new version and with a tag named month indicating the month and year it was registered. Use this dataset for all experiments, identifying the version to be used based on the month tag as necessary.
Answer: Create a tabular dataset that references the datastore and specifies the path 'sales/*/sales.csv', register the dataset with the name sales_dataset and a tag named month indicating the month and year it was registered, and use this dataset for all experiments.
Answer Description

Specify the path with a wildcard. Example: The following code gets the existing workspace and the desired datastore by name, and then passes the datastore and file locations to the path parameter to create a new TabularDataset, weather_ds.

from azureml.core import Workspace, Datastore, Dataset

datastore_name = 'your datastore name'

# get existing workspace
workspace = Workspace.from_config()

# retrieve an existing datastore in the workspace by name
datastore = Datastore.get(workspace, datastore_name)

# create a TabularDataset from 3 file paths in datastore
datastore_paths = [(datastore, 'weather/2018/11.csv'),
                   (datastore, 'weather/2018/12.csv'),
                   (datastore, 'weather/2019/*.csv')]

weather_ds = Dataset.Tabular.from_delimited_files(path=datastore_paths)
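Applied to this scenario, a minimal sketch (the tag value is illustrative) of registering one dataset over a wildcard path; because the wildcard matches any monthly folder, new data is picked up without re-registering:

from azureml.core import Workspace, Datastore, Dataset

workspace = Workspace.from_config()
datastore = Datastore.get(workspace, 'your datastore name')

# a single TabularDataset covering every monthly sales folder
sales_ds = Dataset.Tabular.from_delimited_files(path=(datastore, 'sales/*/sales.csv'))

# register once, with a month tag recording when it was registered
sales_ds = sales_ds.register(workspace=workspace,
                             name='sales_dataset',
                             tags={'month': '12-2019'})   # illustrative tag value

# loads easily into a dataframe, as the requirements demand
df = sales_ds.to_pandas_dataframe()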
Question no 10
You use the following code to run a script as an experiment in Azure Machine Learning: You must identify the output files that are generated by the experiment run. You need to add code to retrieve the output file names. Which code segment should you add to the script?
    ↠ files = run.get_properties()
    ↠ files = run.get_file_names()
    ↠ files = run.get_details_with_logs()
    ↠ files = run.get_metrics()
    ↠ files = run.get_details()
Answer: files = run.get_file_names()
Answer Description Explanation: You can list all of the files that are associated with this run record by calling run.get_file_names(). Reference: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-track-experiments
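A minimal sketch of the pattern, assuming a ScriptRunConfig-based submission and illustrative names:

from azureml.core import Workspace, Experiment, ScriptRunConfig

ws = Workspace.from_config()
experiment = Experiment(workspace=ws, name='my-experiment')          # illustrative name

config = ScriptRunConfig(source_directory='.', script='script.py')  # illustrative script
run = experiment.submit(config)
run.wait_for_completion()

files = run.get_file_names()   # names of all files associated with the run record
print(files)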
Question no 11
You collect data from a nearby weather station. You have a pandas dataframe named weather_df that includes the following data:
Answer Description
Question no 12
You are creating a new Azure Machine Learning pipeline using the designer. The pipeline must train a model using data in a comma-separated values (CSV) file that is published on a website. You have not created a dataset for this file. You need to ingest the data from the CSV file into the designer pipeline using the minimal administrative effort. Which module should you add to the pipeline in Designer?
    ↠ Convert to CSV
    ↠ Enter Data Manually
    ↠ Import Data
    ↠ Dataset
Answer Description

Explanation: The preferred way to provide data to a pipeline is a Dataset object. The Dataset object points to data that lives in or is accessible from a datastore or at a web URL. The Dataset class is abstract, so you will create an instance of either a FileDataset (referring to one or more files) or a TabularDataset that's created from one or more files with delimited columns of data.

Example:

from azureml.core import Dataset

iris_tabular_dataset = Dataset.Tabular.from_delimited_files([(def_blob_store, 'train-dataset/iris.csv')])

Reference: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-create-your-first-pipeline
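Since the CSV in this question is published on a website, note that Dataset.Tabular.from_delimited_files also accepts a web URL directly; a minimal sketch with an illustrative URL:

from azureml.core import Dataset

web_path = 'https://example.com/sales.csv'   # illustrative URL
csv_ds = Dataset.Tabular.from_delimited_files(path=web_path)
df = csv_ds.to_pandas_dataframe()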
Question no 13
You are developing a hands-on workshop to introduce Docker for Windows to attendees. You need to ensure that workshop attendees can install Docker on their devices. Which two prerequisite components should attendees install on the devices? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.
    ↠ Microsoft Hardware-Assisted Virtualization Detection Tool
    ↠ Kitematic
    ↠ BIOS-enabled virtualization
    ↠ VirtualBox
    ↠ Windows 10 64-bit Professional
Answer: BIOS-enabled virtualization; Windows 10 64-bit Professional
Answer Description
BIOS-enabled virtualization: Make sure your Windows system supports Hardware Virtualization Technology and that virtualization is enabled; hardware virtualization support must be turned on in the BIOS settings.
Windows 10 64-bit Professional: To run Docker, your machine must have a 64-bit operating system running Windows 7 or higher. References: https://docs.docker.com/toolbox/toolbox_install_windows/ https://blogs.technet.microsoft.com/canitpro/2015/09/08/step-by-step-enabling-hyper-v-for-use-on-windows-10/
Question no 14
You are creating an experiment by using Azure Machine Learning Studio. You must divide the data into four subsets for evaluation. There is a high degree of missing values in the data. You must prepare the data for analysis. You need to select appropriate methods for producing the experiment. Which three modules should you run in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order. NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.
Answer Description
Question no 15
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution. After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are using Azure Machine Learning to run an experiment that trains a classification model. You want to use Hyperdrive to find parameters that optimize the AUC metric for the model. You configure a HyperDriveConfig for the experiment by running the following code:

Does the solution meet the goal?
    ↠ Yes
    ↠ No
Answer Description Explanation: Use a solution with logging.info(message) instead. Note: Python printing/logging example: logging.info(message). Destination: driver logs, Azure Machine Learning designer. Reference: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-debug-pipelines
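For context, a minimal sketch of a HyperDriveConfig that optimizes the AUC metric, assuming an existing estimator variable and an illustrative search space; the training script must log the metric under the same name (e.g. run.log('AUC', auc)) for Hyperdrive to track it:

from azureml.train.hyperdrive import (HyperDriveConfig, PrimaryMetricGoal,
                                      RandomParameterSampling, uniform)

# illustrative hyperparameter search space
param_sampling = RandomParameterSampling({'--learning-rate': uniform(0.01, 0.1)})

hd_config = HyperDriveConfig(estimator=estimator,              # assumed existing estimator
                             hyperparameter_sampling=param_sampling,
                             primary_metric_name='AUC',        # must match the logged metric name
                             primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
                             max_total_runs=20,
                             max_concurrent_runs=4)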