Building on the approach we used with planning, you're going to use o1 to help with coding tasks. o1 performs best with simple, direct prompting, both to create new applications and to edit existing ones. To test this, you'll run two coding competitions between o1-mini and GPT-4o: one to create a net new application and one to edit some existing code. Let's get coding.

In this lab, we're going to learn how to use o1 to assist us with creating an original app from scratch, and then with editing existing code. To start with, we'll import our variables: our OpenAI API key, and again the GPT model and o1 model that we've been using so far. We'll begin by creating a React app. To start, I'm going to define a function which will allow us to get a chat completion response. The next thing you'll do is define a prompt for how o1 and 4o should think about creating this app. So I'll paste in a prompt here where we give instructions to create an elegant, delightful React component for an interview feedback form. We give it some criteria for how this app should perform and what the overall goal is, plus a couple of reminders at the end. I've included those because I'm working with 4o; generally I wouldn't include them for o1.

So now we'll generate code using both 4o and o1, and then we'll render both apps and visually compare the quality of each. You'll begin with 4o here: we'll get it to generate the app, we'll render the app, and then we'll do the same with o1. Hopefully, we'll see a marked improvement with the o1 model. Let's kick off the generation. Now that the code's done, let's print it and have a look. There we are. I'm just going to copy this out, render it in our app, and we'll be ready to inspect the results. I'm going to display a static image of what I got when I loaded that code into the app, and here we can see an interview feedback form. But it's not so great looking; there's some slightly odd formatting. So hopefully o1 is capable of improving on this.

Now we'll repeat the process with o1. Our hope here is that using the same prompt, we'll get superior results. So let's kick this off to generate our code. Now that that's generated, we'll print it out to have a look at it, and we can see that we've got some divs here. I'm going to copy that into the app and print the image so that we can have a look at the quality of what it produced. And we can see here what you'd have to call a superior app. It's gone with some drop-downs for the different rating numbers, it's listened to all of our instructions instead of including just a subset of categories, and it's giving us a nice green feedback box as well.

So this is a quick view of where o1 can help you: when you have a high-level design for an app and you want a starting point, o1 will often get you a bit further forward than 4o will with equivalent prompting. This is a static example that I've generated, but if you want to delve into this yourself, do some prompt engineering, and then render your own app, please follow the instructions in the notebook. If you want to try out the code, you can go to the link in the notebook, cut and paste the code into the editor, and it'll render the form. You can see the recommendation changing as we fill things in, and you can submit some feedback. If you're an expert in JavaScript or React, the notebook also includes a description of how to download a zip file containing an application you can run on your local computer.
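For reference, here's a minimal sketch of what the generation step could look like in code. It assumes the OpenAI Python SDK (v1+) with an API key set in the environment; the model identifiers, the get_chat_completion helper, and the prompt wording are illustrative assumptions rather than the exact contents of the notebook.

```python
# Minimal sketch of the app-generation step (assumptions: OpenAI Python SDK v1+,
# OPENAI_API_KEY set in the environment, illustrative model names and prompt text).
from openai import OpenAI

client = OpenAI()

GPT_MODEL = "gpt-4o"   # assumed identifier for the 4o model
O1_MODEL = "o1-mini"   # assumed identifier for the o1 model


def get_chat_completion(model: str, prompt: str) -> str:
    """Send a single user prompt to a model and return its text response."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


# A simple, direct prompt works well for o1; extra reminders are mainly for 4o.
app_prompt = (
    "Create an elegant, delightful React component for an interview feedback form. "
    "Include ratings for multiple categories, an overall recommendation, and a "
    "free-text comments field. Return only the code."
)

gpt_code = get_chat_completion(GPT_MODEL, app_prompt)
o1_code = get_chat_completion(O1_MODEL, app_prompt)

print(o1_code)  # copy the output into the React editor linked in the notebook
```

The same helper is used for both models so the only variable in the comparison is the model itself, which keeps the 4o-versus-o1 contest fair.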
Next, we'll move on to editing some existing code and show how o1 can give useful feedback that will result in, again, superior code to that produced by 4o. To compare the editing capabilities of 4o versus o1, we've produced some code that has some clear issues: multiple nested loops, a lack of error handling, and overall it's not super readable. What we'll do is feed this code to both models and see how they clean it up. Then we'll employ o1 as an LLM judge to rate the two resulting code blocks and tell us which one performed better.

We'll begin again with a simple and direct prompt and use 4o for our first generation. So the prompt is: "I have some code that I'd like you to clean up and improve. Return only the updated code that fixes the issues." Then we pass in the code snippet we showed above, generate the response, and have a look at the results. We've now generated some code, and outwardly, at first glance, it looks a little better. But let's generate the o1 code and then we'll compare the two with an LLM as the judge. Here we're going to generate the same output, but this time using o1. And again the results look plausible here, but the best way to figure this out is to employ an LLM to compare the two. So let's do that.

We're going to employ o1 as our grader here, because this is the sort of nuanced, multi-step process where o1 generally tends to perform better than 4o, so again we feel pretty good using it here. And in this case we used, again, a simple and direct prompt: "Which code is better, and why?", with option one being the 4o code and option two the o1 code. So we'll run that and you'll see the results.

The results are in, and o1 has deduced that both of them are trying to do the same thing, but that there are several key differences between the two implementations that make option two better. It's called out the readability and structure; the error handling and robustness, where option two has selected a more effective approach; the data processing and calculations; the debugging and logging; and general performance considerations. So again, a useful example: even if you've been relying on 4o as a code editor or a kind of copilot to help you work through issues and edit your existing code, this is an area that o1 does fairly well at.

Great. You should now be confident using o1 to assist you in coding, whether it's creating net new applications or helping improve your existing code. This is an area where we're seeing a ton of uptake of o1, especially in the first category, where you have a high-level design and you want a really good first stab. Looking forward to being with you for the next stage, where we use o1 to reason on top of images.
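For reference, here's a similar sketch of the editing and judging steps, again assuming the OpenAI Python SDK; the prompt wording, model identifiers, and the messy_code placeholder are illustrative rather than the notebook's exact code.

```python
# Minimal sketch of the code-cleanup and LLM-as-judge steps (assumptions:
# OpenAI Python SDK v1+, OPENAI_API_KEY set, illustrative names and prompts).
from openai import OpenAI

client = OpenAI()
GPT_MODEL = "gpt-4o"   # assumed identifier for the 4o model
O1_MODEL = "o1-mini"   # assumed identifier for the o1 model


def ask(model: str, prompt: str) -> str:
    """Single-turn chat completion returning the model's text response."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


edit_prompt = (
    "I have some code that I'd like you to clean up and improve. "
    "Return only the updated code that fixes the issues.\n\n{code}"
)
messy_code = "..."  # stand-in for the deliberately messy snippet in the notebook

gpt_cleaned = ask(GPT_MODEL, edit_prompt.format(code=messy_code))
o1_cleaned = ask(O1_MODEL, edit_prompt.format(code=messy_code))

# o1 as the judge: a nuanced, multi-step comparison where it tends to do well.
judge_prompt = (
    "Which code is better, and why?\n\n"
    f"Option 1:\n{gpt_cleaned}\n\n"
    f"Option 2:\n{o1_cleaned}"
)
print(ask(O1_MODEL, judge_prompt))
```

Using o1 as the judge keeps the grading step on the model that, as discussed above, tends to handle this kind of nuanced, multi-step comparison best.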