In this lesson, you will explore your prototype notebook, which you'll turn into a pipeline later in the course. The notebook contains a fully functional RAG prototype that ingests and embeds book descriptions and lets you query them for a book you'd like to read. Let's dive into the code.

So let's build your RAG prototype application. You'll use this prototype to learn how to translate a notebook into a pipeline consisting of two DAGs. After this course, you'll be able to use Airflow to orchestrate other types of AI application notebooks you've built, so don't worry if you aren't that familiar with RAG or are looking to orchestrate a different type of workflow. The principles hold true for any pipeline in an AI application.

First, you import the libraries used in this notebook: os and json to interact with files, JSON from IPython to display JSON dictionaries in a structured format in the notebook, and TextEmbedding from fastembed to create vector embeddings of the book descriptions, including the one for your favorite book. Those vector embeddings will need a vector database home. Because Airflow can orchestrate any Python code, you can use it with any library interacting with your favorite AI tooling, including a wide variety of vector databases. In this example, you'll use the Weaviate vector database. Lastly, you import a helper function to suppress verbose logging.

Next, you define a few variables that are needed in several locations throughout the RAG application: the name of the collection in Weaviate where you will store your embeddings; the folder where your book descriptions are stored (there are already descriptions waiting to be ingested in include/data); and the embedding model used in this example, the popular and lightweight BAAI/bge-small-en-v1.5, which is optimized for semantic search and retrieval.

In a notebook environment, you often work with local or embedded versions of tools.
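Collected in one place, that setup might look like the sketch below. The specific values (the collection name, the data folder, and the constant names themselves) are assumptions based on the narration, not the notebook's verbatim code.

```python
# Hypothetical names mirroring the narrated setup; your notebook's
# identifiers and paths may differ.
COLLECTION_NAME = "Books"                        # Weaviate collection for the embeddings
BOOK_DESCRIPTION_FOLDER = "include/data"         # folder holding the description text files
EMBEDDING_MODEL_NAME = "BAAI/bge-small-en-v1.5"  # lightweight model for semantic search
```

Defining these once at the top of the notebook keeps them easy to find when you later move the same values into your DAG code.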
This is true here as well: you create and connect to an embedded Weaviate database, persisted in a local temporary directory. Pause the video and run the cell to create your Weaviate instance and confirm that the client is ready. Later, when you translate this notebook into Airflow DAGs, you'll connect to a Weaviate instance running in Docker instead.

Next, you'll create a collection in your Weaviate instance. A collection in Weaviate is a set of objects that share the same data structure, like a set of product descriptions, support tickets, or, in our case, book descriptions. The code in the cell uses the Weaviate client you instantiated previously to first list all collections that already exist in this instance. The next line fetches the names of those collections, and the if statement checks whether the name of the collection you want to create is already in the list of existing collection names. If it is not, you create the collection using the .create method and store a reference to that new collection in a variable called collection. If the collection does already exist in this Weaviate instance, you can simply retrieve it using the .get method and assign it to the collection variable to interact with it later in the notebook.

Weaviate is ready and hungry for some data. There are already a couple of descriptions of good books stored in text files in the book description folder. Running the cell, you can see that there are currently two files in this folder, each containing the data for several books. Let's add a third file including books that you love. The format is simple: an integer index, then three colons as a separator, the book title followed by the release year in brackets, then after another separator the author, and lastly the description of the book. I'll add two of my own favorite books: The Idea of the World by Bernardo Kastrup and Exploring the World of Lucid Dreaming by Stephen LaBerge. The next two lines write the new book descriptions to a file in the same book description folder.
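The create-or-get logic described above can be sketched as a small helper. The collections.list_all / create / get calls follow the Weaviate v4 Python client's shape, but treat the exact signatures as an assumption and check them against your client version; the function name itself is hypothetical.

```python
def get_or_create_collection(client, name):
    """Return the collection `name`, creating it only if it is absent.

    Mirrors the notebook's if/else: list the collections that already
    exist, create the collection when it is missing, otherwise fetch
    the existing one so you can interact with it later.
    """
    existing = client.collections.list_all()  # names of existing collections
    if name not in existing:
        return client.collections.create(name)
    return client.collections.get(name)
```

Calling this twice with the same name is safe: the second call simply retrieves the collection created by the first, which is exactly the idempotent behavior you'll want later in a pipeline.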
You can pause the video here and add your own favorite books; you can add as many as you'd like. Just make sure to follow the described format and that each book is on a new line. Awesome! Now you just need to get this data from the text files, embed the text, and then load the embeddings into Weaviate.

For the first step, you loop through the list of description files and read each file using .readlines(). This method creates a list where each element is one line in the file, containing the data for one book. Next, you use the three-colon separator to gather the book titles, authors, and description texts and format them into one dictionary per book. The list of book dictionaries for each file gets stored in a big list called list_of_book_data. You can use the JSON function from IPython to display the content of that list in an explorable, structured format. Feel free to pause the video to check that your book data was correctly added.

In a RAG application, vector embeddings are used to make semantic search, searching for similar texts based on an input, possible. Many AI apps you interact with every day use RAG to give you better answers. For example, a chatbot on an e-commerce platform most likely has access to the vector embeddings of up-to-date proprietary product descriptions of all products sold, and can therefore give you more accurate shopping recommendations than a general-purpose chatbot. The same principle is implemented in this prototype, so let's embed the book descriptions.

Next, you instantiate the embedding model using TextEmbedding from fastembed and then iterate through the list of book data to get the description of each book. The .embed method creates the vector embeddings. The list function is used to make sure the embeddings are converted from a generator object to a Python list. Next, you can zip the list of embeddings and the list of texts from the book data together and insert the data into the Weaviate database.
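The readlines-and-split step can be sketched in plain Python. The field layout (index, title with year, author, description, separated by three colons) follows the format described earlier; the function name and dictionary keys are hypothetical stand-ins for whatever the notebook uses.

```python
def parse_description_file(path):
    """Parse one book-description file into a list of dicts.

    Each non-empty line has the form:
        <index>:::<title (year)>:::<author>:::<description>
    """
    books = []
    with open(path) as f:
        for line in f.readlines():
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Split on the three-colon separator into the four fields.
            index, title, author, description = line.split(":::")
            books.append({
                "title": title.strip(),
                "author": author.strip(),
                "description": description.strip(),
            })
    return books
```

Looping this function over every file in the book description folder and collecting the results gives you the big list of book dictionaries described above.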
The combined data for each book (title, author, and description, as well as the vector embedding) is defined as one data object to be put into the Weaviate collection; Weaviate supports batch inserts of data objects with the .data.insert_many method. You are now all set to get a book recommendation. "I would like to read a philosophical book": this query string is embedded using the same embedding model to perform a near-vector search for the book whose description matches the query string most closely. Running the cell, I get a recommendation for The Idea of the World, definitely a very philosophical book. Pause the video and try out different queries; experiment to see if you can get a recommendation for one of the favorite books you added previously.

All right, you are in a familiar situation: you have a working prototype of an AI application, in this case a RAG application that can give book recommendations, which would be very useful to integrate into a chatbot on a website for a bookstore. In a real-life situation, you need new books to be regularly added to the database, ideally automatically, with observability over whether the operations were successful, protection against transient issues like API rate limits, and alerts if something goes wrong. This is where orchestration comes into play. You have the prototype; let's turn it into a pipeline in Airflow. In the next lesson, you'll learn how to create basic Airflow pipelines and interact with them in the Airflow UI before using the code from this notebook to build your pipeline.
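To illustrate what a near-vector search does conceptually, here is a minimal pure-Python sketch that ranks stored vectors by cosine similarity to a query vector. Weaviate performs an optimized, indexed version of this internally; the helper names and the toy data shapes here are hypothetical, not Weaviate's API.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def near_vector(query_vec, embedded_books, limit=1):
    """Return the `limit` books whose embedding is closest to the query.

    `embedded_books` is a list of dicts, each with a "vector" key holding
    the book description's embedding.
    """
    ranked = sorted(
        embedded_books,
        key=lambda item: cosine_similarity(query_vec, item["vector"]),
        reverse=True,  # highest similarity first
    )
    return ranked[:limit]
```

In the notebook, the query string is first run through the same embedding model as the descriptions, and the resulting vector plays the role of `query_vec`: using the same model for both sides is what makes the similarity comparison meaningful.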