So far in the course, you've been working with a cleaned CSV file of customer reviews. In this video, you're going to upload that same file into Snowflake and turn it into a proper table. In the next video, you'll recreate that same dataset by combining a messy pile of Word docs. But for now, let's start simple and get the clean version loaded, so you can get familiar with the process from start to finish.

First, let's check in with your MVP build plan from Module 1. You've already worked through this pipeline in Module 1, but this time we're starting fresh in Snowflake, so you can see how the full upload and table creation process works in this environment. It's time to tackle Step 1: getting data into Snowflake.

If you've worked in data science or analytics, chances are you've handled a lot of tabular data, usually in the form of CSVs, Excel files, or other spreadsheet-style formats. Snowflake supports all of the most common formats and many more: structured files like CSV and TSV, your classic spreadsheet formats, which are great for clean, organized data; semi-structured data like JSON, Avro, Parquet, and XML, more flexible formats often used for logs, events, or nested information; and documents such as PDFs and Word files. If you're coming from a traditional data science background, that last category might be less familiar, but thanks to GenAI it's now much easier to analyze and extract insights from this kind of data.

Before we upload anything, let's quickly review how Snowflake organizes your data. Think of it like an office filing cabinet. Databases are the cabinet itself, a top-level container for a project. Schemas are folders inside that cabinet that keep things organized. Tables, views, and stages are your actual data and the tools for managing it: tables are the rows and columns of structured data, views are saved queries that look like tables, and stages are upload zones that hold files before they're loaded into tables. This structure keeps your data clean, modular, and easy to navigate. And now, you're going to set it up for yourself.

Following this organization, you're going to create three things: a database to store your project, a schema to organize your files, and a stage where you'll upload the avalanche-customer-reviews.csv file. This setup mirrors the most typical project setup in Snowflake.

First up, the database. This is your top-level container for everything related to the Avalanche app. To create a new database in Snowsight, click on the Data tab in the left sidebar. This opens the Databases window. Click the + Database button in the top right corner, give your database a name like avalanche_db, then click Create Database. Step one, check. Over on the left, you now have a new avalanche_db database that you'll use as a top-level container for all the Avalanche schemas and tables.

Now, let's add a schema to keep things organized. This is where you'll store your raw files and tables. Click on your new avalanche_db in the left-hand sidebar. On the top right of the avalanche_db window that opens, click + Schema. Name your schema something like avalanche_schema, then click Create. Now you've got a place to organize raw files separately from clean ones, or anything else you add later.

Finally, let's create a stage. This is your upload zone: the place that holds raw files before they're loaded into tables. It's especially useful if you plan to reuse files across notebooks or across users in the same workspace. Click on the avalanche_schema that is now listed under avalanche_db. On the top right of the screen, click the blue Create button, and from the drop-down menu choose Stage, then Snowflake Managed. Unless you have a reason to select Externally Managed Storage, this is the easiest option to configure. In the Create Stage window, name your stage something like avalanche_stage. If the schema isn't already pointing to avalanche_db.avalanche_schema, update it now. Select Server-Side Encryption, leave everything else at default, and click the bright blue Create button on the bottom right.
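You did all of this through the Snowsight UI, but the same setup can be scripted. Here's a minimal sketch of the equivalent SQL, run through a Snowpark session like the one you'll create later in this lesson; the object names match this walkthrough, and you could just as easily paste the statements into a SQL worksheet:

```python
from snowflake.snowpark.context import get_active_session

session = get_active_session()  # available inside any Snowflake notebook

# Each statement mirrors one of the UI steps above.
session.sql("CREATE DATABASE IF NOT EXISTS avalanche_db").collect()
session.sql("CREATE SCHEMA IF NOT EXISTS avalanche_db.avalanche_schema").collect()
session.sql(
    "CREATE STAGE IF NOT EXISTS avalanche_db.avalanche_schema.avalanche_stage "
    "ENCRYPTION = (TYPE = 'SNOWFLAKE_SSE')"  # server-side encryption, as selected in the UI
).collect()
```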
When your stage is ready, it's time to upload avalanche-customer-reviews.csv to it. Once the stage is created, you'll be taken to its settings window. On the top right of the screen, click the + Files button. Drag and drop, or browse to the location where you cloned the course repo and select avalanche-customer-reviews.csv, then click Upload. Your file is now stored securely in avalanche_stage and ready to be referenced in queries and scripts.

Now that your data is in Snowflake, the rest of the work can be done in a Snowflake notebook. Snowflake notebooks are like Jupyter notebooks, but hosted in Snowflake: you can write and run both Python and SQL, visualize data, and work directly with your Snowflake environment. To open a new notebook, click Projects in the left-hand Snowflake sidebar, then click Notebooks. On the top right of your screen, click the + Notebook button. Name your notebook something like Avalanche Customer Reviews, and select your Avalanche database and schema. Leave the runtime option at Run on Warehouse; this is your best option for data analysis with Python because it comes with most of your data science packages pre-installed. Leave everything else at default, then click Create.

Once your notebook opens, you'll be taken to the main editor window, which is where you write your code. Your notebook opens with a few cells of example code in both Python and SQL, shown on the top left of each cell. To run a cell, hit Shift+Enter on your keyboard, or press the Run button on the top right of the screen. You can try that now to see how it works. The first code cell is how you connect to any data stored in Snowflake, so leave that in place for grabbing the customer reviews. You can go ahead and delete the last two example code blocks by clicking the three-dot menu on the top right of each block and selecting Delete.

Now that you've set up your database, schema, and stage, you're ready to load the avalanche-customer-reviews.csv file into a DataFrame inside your Snowflake notebook. In Snowflake, you'll mostly be working with two types of DataFrames: pandas DataFrames and Snowpark DataFrames. They look similar and support many of the same operations, but under the hood they behave very differently. Let's break it down. pandas DataFrames run everything immediately on your local machine. That's great for quick analysis on small datasets, but they can slow down or crash when the data gets large. Snowpark DataFrames don't execute right away. Instead, they build a query plan, a kind of blueprint that describes what should happen, like filtering, joining, or transforming data, but doesn't actually run anything until you ask for a result. Then, when you're ready, the entire plan is sent to Snowflake's cloud infrastructure and executed all at once, right where the data lives. That means no downloading, no memory overload, and way more speed at scale.

So which one should you use? The table shown here gives you a good overview of the differences. Use pandas for fast local testing on small files. Use Snowpark DataFrames when you're working with larger datasets, or when you want to tap into the full power of Snowflake's compute engine.
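To make that difference concrete, here's a minimal sketch of what lazy execution looks like in practice. It assumes the session object and stage path you'll set up over the next few steps, and the RATING and PRODUCT columns are purely illustrative:

```python
# Nothing executes here: each line only extends Snowpark's query plan.
df = session.read.option("INFER_SCHEMA", True).csv("@avalanche_stage/avalanche-customer-reviews.csv")
top_reviews = df.filter(df["RATING"] >= 4)            # RATING is an illustrative column name
by_product = top_reviews.group_by("PRODUCT").count()  # PRODUCT is illustrative too

# Only now is the whole plan compiled and run inside Snowflake, where the data lives.
by_product.show()
```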
Now that you understand how Snowflake handles DataFrames, let's put that into practice by loading the avalanche-customer-reviews.csv file into a Snowpark DataFrame. The first step is to connect your Snowflake notebook to your project environment. Snowflake handles this for you automatically using something called an active session. Each Snowflake notebook starts out pre-populated with this code block; run it by clicking the play button on the top right. The code begins by importing your core libraries, Streamlit and pandas, just like you've done before. Then it calls get_active_session from the Snowpark library, which creates a direct connection to your Snowflake project. Once that session is active, you are fully plugged into Snowflake's backend. That means you can query existing tables, load files from a stage, write data into new tables, and run everything inside Snowflake's cloud instead of on your local machine.

Now that you're connected to Snowflake and have a session ready to go, it's time to load your data. Let's pull the avalanche-customer-reviews.csv file into a Snowpark DataFrame so you can start working with it directly inside your notebook, using this line of code. It loads your CSV file straight from a Snowflake stage into a Snowpark DataFrame so you can preview and work with it in your Python code. session.read tells Snowpark you're about to read data into a new DataFrame. The option("INFER_SCHEMA", True) call is a helpful shortcut that has Snowflake automatically detect the column names and data types from the contents of your CSV; without it, everything would be loaded as plain text. csv("@avalanche_stage/avalanche-customer-reviews.csv") points directly to the file you uploaded earlier: avalanche_stage is your named stage, the secure upload zone you created, and avalanche-customer-reviews.csv is the file sitting inside it. Finally, df.show() lets you preview the top few rows of your dataset, just like you would with df.head() in pandas. At this point, you're reading Snowflake-hosted data into your notebook, analyzing it with Python, and previewing the results, all without ever leaving the cloud. Let's click on play.

Before you move on, it's worth understanding how file paths work in Snowflake, especially when you're working with stages. Take the path @avalanche_stage/avalanche-customer-reviews.csv. The at symbol (@) tells Snowflake you're referring to a stage, your storage area for files. avalanche_stage is the name of the stage you created earlier; it's where you uploaded your file. avalanche-customer-reviews.csv is the name of the file you placed inside that stage. If you had uploaded your file to a subfolder in the stage, the path would include that folder, something like @avalanche_stage/raw/avalanche-customer-reviews.csv. Snowflake treats stages kind of like cloud directories: you can organize files inside them using folder-like paths, even though the underlying storage is flat. As you work with multiple files in later lessons, understanding how these paths work will help you stay organized and avoid errors.

Now that you've previewed your CSV file and confirmed everything looks right, it's time to make your data permanent. Right now, your data only lives in memory inside your notebook, which is like a temporary scratchpad. To make it truly useful, you'll want to turn it into a table inside your Snowflake database. Here's the code to do that. This line tells Snowpark to write the DataFrame out as a table: customer_reviews is the name of the table you're creating in your Snowflake database, and mode("overwrite") tells Snowflake that if a table with this name already exists, it should be replaced with this one. Once you run this command, your DataFrame is no longer temporary. It becomes a real, permanent table stored in your project's database and schema. That means you can write SQL queries against it, connect it to Streamlit apps, join it with other datasets, and share it with your team or other tools inside your Snowflake environment. This is the key step that turns raw data into a structured, Snowflake-native resource, ready to be queried, visualized, and built upon. Don't forget to run this cell.
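Put together, the working part of this notebook is only a handful of lines. Here's a sketch of those cells; the object and file names match the ones used in this lesson, so adjust them if yours differ:

```python
import streamlit as st
import pandas as pd
from snowflake.snowpark.context import get_active_session

# Cell 1: connect to your Snowflake project.
session = get_active_session()

# Cell 2: read the CSV from the stage into a Snowpark DataFrame,
# letting Snowflake infer column types from the file contents.
# Depending on your file, you may also want .option("PARSE_HEADER", True)
# so column names are taken from the header row.
df = (
    session.read
    .option("INFER_SCHEMA", True)
    .csv("@avalanche_stage/avalanche-customer-reviews.csv")
)
df.show()  # preview the first few rows, like df.head() in pandas

# Cell 3: persist the DataFrame as a permanent table, replacing any existing one.
df.write.mode("overwrite").save_as_table("customer_reviews")
```

Once that last cell runs, session.table("customer_reviews") or a plain SQL SELECT will return the same rows from anywhere in your Snowflake account.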
Nice work. In this lesson, you created a new database, schema, and stage to organize your data; uploaded the avalanche-customer-reviews.csv file to that stage; loaded it into a Snowpark DataFrame directly from the stage in a Snowflake notebook; previewed the data to verify it looked good; and saved it as a permanent, queryable table inside Snowflake. You've officially completed step one of your MVP build plan: getting data into Snowflake. In the next video, you'll level up by taking a pile of raw DOCX customer reviews and transforming them into clean, structured data using GenAI-powered tools from Snowflake Cortex.