In this lesson, you'll try out one of the simplest and most impactful strategies for carbon-aware ML development: training a model in a location that is powered by a high amount of carbon-free energy. This strategy doesn't take too much effort, but as you'll see, it has a pretty big payoff. So let's try it out. The first thing we're going to do is train an ML model locally in this notebook. We'll start off by importing some libraries: numpy, sklearn, and TensorFlow. Then we'll create our dataset, and we're going to do this using the make_blobs function in sklearn, which will make a training dataset with four different categories. If you're interested in more background on these functions and the specific classification task here, this is actually the same example that is taught in Andrew's Machine Learning Specialization, so you can go and check that out; it's in the second course, on advanced algorithms. After we've created our data, we'll go ahead and create the model. Again, the details here aren't super important for this lesson; we just need a model, and this one has two layers in it. Once we've done that, we can compile the model. This will look familiar to you if you've used Keras before, and if you haven't, don't worry about it too much. Compiling is where you specify the loss function as well as the optimizer that you want to use. So we can execute that cell, and once we've done that, we've created our training data and we've created our model. Then we can call model.fit, which is how we train the model. We'll pass in our training data, X_train and y_train, and we'll let this run for 200 epochs. This should train pretty quickly because it's a small model and not a whole lot of data. You can scroll all the way down and see that the loss is decreasing, so it looks like our model is learning something. Okay, we're at the bottom here, so the model has finished training.

Now, one of the benefits of training in the cloud is that you have some flexibility when selecting where to train the model. Right now, if you are located someplace that is connected to a grid with really high carbon intensity, you don't really have a whole lot of control over where that electricity comes from. But you can take advantage of lower-carbon grids by running compute workloads in the cloud. By using the cloud, you have the ability to specify where you actually want to run this compute, and you can select regions that just have more carbon-free energy available. So that's exactly what we're going to do now: we're going to run the same training code, but we'll run it on Google Cloud, and the reason we're doing that is so that we can select a data center location that has a really low average carbon intensity.
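For reference, here is a minimal sketch of that local training step. The layer sizes, learning rate, and make_blobs parameters below are illustrative assumptions and may differ from the exact values in the course notebook.

```python
import numpy as np
from sklearn.datasets import make_blobs
import tensorflow as tf

# Create a small synthetic dataset with four classes (clusters).
# The sample count and random seed are illustrative choices.
X_train, y_train = make_blobs(n_samples=1000, centers=4, random_state=30)

# A small two-layer model for 4-way classification.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(25, activation="relu"),
    tf.keras.layers.Dense(4, activation="linear"),  # outputs logits for 4 classes
])

# Compiling is where we specify the loss function and the optimizer.
model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
)

# Train for 200 epochs; the loss should decrease steadily.
model.fit(X_train, y_train, epochs=200)
```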
Before we do this, we need to learn a few things about Google's machine learning platform, Vertex AI. We're going to run all of this code on Vertex, and in order to do that, we'll need to take three main steps: we'll import and initialize the Vertex AI Python SDK, we'll write our training code to a file, and then we'll configure and submit a training job that will run this code. So let's run through each of these steps one by one, starting with step one, initializing and setting up the Vertex AI Python SDK. We have a helper function already written for you which will import credentials and a project ID. If you're unfamiliar with Google Cloud, here are a couple of basics. The first thing we do is import aiplatform, which is the machine learning platform on Google Cloud. Then you'll need to call this initialize function, where you pass in two things: the project ID and the credentials. The project ID is a reference to your Google Cloud project; that's just where all of the services and any compute you use are contained. And these credentials here are secret credentials for authentication. Again, this has been set up for you already in the notebook; you just need to run this authenticate function. Once we've imported our credentials and project ID, we can import the aiplatform library and call initialize. So we'll say aiplatform.init and pass in the project ID along with our credentials, and this is so that we can access the Vertex AI services from this notebook.

Now that we've got the setup done, we're ready to move on to step two. Instead of running the training code directly in the notebook like you just did, we're going to write that code to a file, and we'll run that file on Vertex AI. So let's first create a file. We'll use the notebook magic called %%writefile, and I'm going to call this file task.py. What this means is that when I execute the cell, whatever code I put in it gets written to a file called task.py. I'm calling it task.py here; you could call the file something different if you really wanted to. On Vertex AI it's just a general convention that the main training code is called task.py. This file needs to have all of the key components of our ML training application. That means importing libraries, then the code to create the dataset, and, just as a reminder, this is all the same code we just ran directly in the notebook. We need the code to create the model, the code to compile the model (which is where we specify our loss function and our optimizer), and finally, when we're done with all of that, we call model.fit, which is what will actually train the model. So when I execute this cell here, you can see that it's created a file called task.py. If we take a look, you can see the file is right here, and we could even say cat task.py, and you can see that this prints out exactly the code that we specified: we import our libraries, we create the dataset, create the model, and then train. So all of our code is now in this one file.

Now that we've defined the training code we want to execute on Vertex AI, we have to create and run the training job, and we'll do this using something called a custom training job in the Vertex AI Python SDK. So let me show you what that looks like. We're going to create a custom training job using this aiplatform library, and there are a couple of important arguments we'll need. The first is the display name, which is a string identifier for the job. I'm calling this DeepLearning.AI course example, but feel free to change the name if you like in the notebook. The next thing we need to pass in is the path to our Python file. This is the script path, and that is task.py, the file we just wrote and looked at, which has all of our training code. The next thing we need to pass in is a container URI.
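In notebook form, those first two steps look roughly like the sketch below. The authenticate() helper, its module name, and the order of its return values are assumptions specific to this course environment; in your own project you would typically rely on application default credentials. The code written to task.py is the same training sketch shown earlier.

```python
# Step 1: initialize the Vertex AI Python SDK.
from helper import authenticate  # hypothetical course helper module
credentials, PROJECT_ID = authenticate()  # assumed return order

from google.cloud import aiplatform
aiplatform.init(project=PROJECT_ID, credentials=credentials)
```

```python
%%writefile task.py
# Step 2: everything in this cell is written to task.py instead of being executed.
# It is the same dataset / model / compile / fit code we ran locally.
import numpy as np
from sklearn.datasets import make_blobs
import tensorflow as tf

X_train, y_train = make_blobs(n_samples=1000, centers=4, random_state=30)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(25, activation="relu"),
    tf.keras.layers.Dense(4, activation="linear"),
])
model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
)
model.fit(X_train, y_train, epochs=200)
```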
We need a Docker image to run our training job, but we don't have to write a Dockerfile or build an image ourselves, because Vertex provides pre-built images for TensorFlow, PyTorch, sklearn, and XGBoost. Here are just some of the images, and if you click on one, you can see everything that's included. We're writing a TensorFlow model, so we can click over here to TensorFlow, and you can see this is the most recent image. If you click over here on included dependencies, you can see what else comes with it; it's not just TensorFlow, we've also got XGBoost and a couple of other useful libraries that we might need. If you want to use this image, you just copy the URI right here, and that's what we're using in the actual notebook. The next thing we need to pass is the location, which is the data center location where we want this workload to run. When we pick this in the notebook, we're going to pick a location that has a lot of carbon-free energy available. And the last thing we'll need to pass is a staging bucket. Buckets are just places in Cloud Storage where you can store any kind of object, and you need a staging bucket for any intermediary artifacts that might get created during the training job. So those are all the arguments we need to pass. Let's try this out in the notebook.

We're going to start by selecting the region where we want to run this training job. Google publishes average carbon intensity values for each of its data center regions. If we scroll down on this page over here, you can see the carbon data across different Google Cloud regions: the name of the region, where it's located, and the grid carbon intensity, which indicates the average emissions per unit of energy from the grid that data center is connected to. If we look at some of these regions, you can see that europe-west9 is actually in Paris. In the first lesson, we took a look at France on Electricity Maps and saw that there was a lot of nuclear energy there, so this is considered a low-CO2 region. Some of these regions have these Low CO2 markers, which just indicate that they have a lot of carbon-free energy available. This Paris region has one because it's in France, where there's a lot of nuclear energy available. We can also look at southamerica-east1, which is São Paulo; we took a look at Brazil on Electricity Maps as well, and this one is also pretty low CO2, probably from a lot of hydro and, I'm guessing, some solar as well. And then there's Los Angeles, which we looked at in the previous lesson when we were checking out Santa Monica. It has an average carbon intensity of 202, but it's not considered a low-CO2 region, and I'm guessing that's because the energy mix fluctuates pretty widely throughout the day. So we're going to go ahead and pick us-central1, which is in Iowa, has a lot of wind energy, and is considered a low-CO2 region. We're going to use that as the region for running our ML training job. So the first thing we'll do is define a region variable and set it to us-central1.
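In the notebook, that region choice and the prebuilt container boil down to something like this sketch. The image URI below is a placeholder, not necessarily the one used in the course; you would copy the current TensorFlow training image URI from the Vertex AI prebuilt-containers documentation page.

```python
# The data center region where we want the training job to run.
# us-central1 (Iowa) is marked "Low CO2" thanks to a large share of wind energy.
REGION = "us-central1"

# A prebuilt Vertex AI training container for TensorFlow. This URI is a
# placeholder; copy the current image URI from the Vertex AI documentation.
CONTAINER_URI = "us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-11:latest"
```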
So that's the Google Cloud region where we want to run our training job. The next thing we'll need to do is create a Cloud Storage bucket. Again, this is a location for storing any staging artifacts that get created along the way in our training job. Before we do that: bucket names need to be globally unique, so we're going to create a unique identifier which we'll append to your bucket name, just to avoid name collisions with other learners. I'm going to import random and string, and then we have this handy function over here which creates the unique identifier. If you execute this, you can print out your UUID here. This is my unique identifier, so hopefully yours looks different; otherwise this function didn't work. We'll use this when we create our bucket name.

In order to create the bucket, the first thing we need to do is import the Cloud Storage Python SDK, so we'll say from google.cloud import storage. Then we need to create our storage client by calling storage.Client, and to this we pass in the same things we passed earlier when we were initializing Vertex AI: our project ID and our credentials. Once we've done that, we can execute this cell. Now we just need to come up with a name for our bucket. I'm going to say bucket_name, and we'll use f-string formatting to call it carbon-course-bucket and add that unique identifier to the end. This unique identifier is really important: if you leave it off and just try to create a bucket called carbon-course-bucket, you'll probably run into a name collision with another learner's bucket and get an error later on. So make sure you include this unique identifier here. We can take a look: now we have our bucket name, and again, yours will look different because you'll have a different unique identifier. Once we've done that, we can use our storage client to create a bucket, and when we do, we'll make sure we create it in a specific region by saying location equals region. Region was us-central1, which we set up here and which is where we want to run the training job. That's important because we need to create the bucket in the same location where we plan to run our training job.

Once we've done all that, we're finally ready to create this custom training job. We're going to call it job, and we'll say aiplatform.CustomTrainingJob, and we'll pass in all those arguments that we took a look at on the slide. That starts with a display name, which is a name you can make up for yourself if you want; I'm calling it DeepLearning.AI course example. Next we'll pass in the path to our Python file, which was task.py. Then we need to pass in the container URI, which is the Docker image that Vertex provides out of the box for us, so we'll just paste that in there. Then we need to specify our staging bucket. This is the name of the bucket we just created; it'll look something like carbon-course-bucket plus your unique identifier, and the gs:// prefix is just a way of identifying a Google Cloud Storage bucket. Lastly, we need to specify the location, and of course this is really important, because this is where we're specifying to train our model in a region that has a lot of carbon-free energy available: the us-central1 region.
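Putting those pieces together, the bucket creation and job configuration look roughly like the sketch below. It assumes PROJECT_ID, credentials, REGION, and CONTAINER_URI were defined in the earlier cells, and the generate_uuid helper is a stand-in for the course notebook's unique-ID function.

```python
import random
import string

from google.cloud import storage

# Stand-in for the notebook's helper: a short random suffix so the bucket name
# is globally unique and doesn't collide with other learners' buckets.
def generate_uuid(length: int = 8) -> str:
    return "".join(random.choices(string.ascii_lowercase + string.digits, k=length))

UUID = generate_uuid()

# Create the staging bucket in the same region where the training job will run.
storage_client = storage.Client(project=PROJECT_ID, credentials=credentials)
bucket_name = f"carbon-course-bucket-{UUID}"
bucket = storage_client.create_bucket(bucket_name, location=REGION)

# Configure the custom training job with the arguments from the slide.
job = aiplatform.CustomTrainingJob(
    display_name="DeepLearning.AI course example",  # any string identifier works
    script_path="task.py",                          # the training code we wrote
    container_uri=CONTAINER_URI,                    # prebuilt TensorFlow image
    location=REGION,                                # us-central1, a low-CO2 region
    staging_bucket=f"gs://{bucket_name}",           # gs:// marks a Cloud Storage bucket
)
```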
So once we've done that, we can execute the cell, and finally we can call job.run. So let's do that. When we execute this cell, what's happening behind the scenes is that Vertex is provisioning compute resources and running the training code we wrote in that task.py file, and it will ensure that the compute resources get deleted when the training job is finished. The logs here are going to print out a few different things. You'll see a link; you won't be able to click on it since you're using an online classroom, but just for reference, if you do this in your own projects, you'll be able to see the status of the training job and how long it's taking (this training job will probably take about 3.5 minutes to complete), as well as the Docker image used to train it, and you can track the progress right there in the cloud console. Now, this job is going to take probably around five-ish minutes, is my guess, so it will take a little bit longer than it did to just execute the cell directly in the notebook, and that's because provisioning the resources adds some extra overhead. That overhead is going to feel pretty dramatic here, since this model is so small, it trains so quickly, and the code runs almost instantly in a notebook. But that's of course not really representative of real-world training scenarios, and in those more realistic scenarios, where your training job takes a lot longer, the overhead is amortized; that just means the fixed overhead becomes a much smaller fraction of the total training time.

Something else to keep in mind: you'll notice that we had to create the bucket in the same location where we run the training job. That's because, generally, you want to store data in the same location where you run your compute. If you were running a really I/O-heavy job, you wouldn't want the carbon savings from running in some faraway, low-carbon region to be outweighed by moving data across the world. Machine learning training jobs are workloads that are pretty well suited to a strategy like this because they're batch jobs and they aren't super time sensitive. They don't have the latency requirements you might have when serving an ML model, where you probably want the region to be close to your user traffic. So for a batch job like training, you can sometimes be a little more flexible when choosing where to run your program and where you want to store your data.

You'll know that the training job has completed when the logs stop printing out and it looks something like this. That just means we've finished training our model on Vertex AI. And with that, you've seen one of the simplest and most effective strategies for carbon-aware computing: taking whatever you planned to run and making sure you run it someplace that has a lot of carbon-free energy available. Before we wrap up this section, we need to do one more thing, which is delete the bucket we just created, just to clean up the extra resources in this online classroom environment. We'll call bucket.delete, and we're setting force equals true because the training job did write some intermediary staging artifacts and files to this bucket.
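Those last two steps, submitting the job and cleaning up the bucket, look roughly like this sketch, continuing from the job and bucket objects created above.

```python
# Submit the training job. Vertex AI provisions compute, runs task.py inside the
# prebuilt container, streams logs (including a console link), and tears the
# resources down when training finishes.
job.run()

# Clean up: delete the staging bucket. force=True is needed because the training
# job wrote intermediate staging artifacts into it.
bucket.delete(force=True)
```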
So, in this lesson, we took one of the simplest but also one of the most effective measures for reducing the carbon impact of our ML training jobs. We looked at the Google Cloud documentation to find a region with a low average carbon intensity, and that's where we trained our ML model. Again, this is a really effective strategy for lowering your carbon footprint when it comes to compute. In the next lesson, we're going to try something a little different and use some real-time data to inform our decision of where to train. So I'll see you in that next lesson.