Quick Guide & Tips

💻   Accessing Utils File and Helper Functions

In each notebook on the top menu:

1:   Click on "File"

2:   Then, click on "Open"

The left sidebar will then list all the notebook files for the lesson, including any helper functions used in the notebook. See the following image for the steps above.


🔄   Reset User Workspace

If you need to reset your workspace to its original state, follow these quick steps:

1:   Access the Menu: Look for the three-dot menu (⋮) in the top-right corner of the notebook toolbar.

2:   Restore Original Version: Click on "Restore Original Version" from the dropdown menu.

For more detailed instructions, please visit our Reset Workspace Guide.


💻   Downloading Notebooks

In each notebook on the top menu:

1:   Click on "File"

2:   Then, click on "Download as"

3:   Then, click on "Notebook (.ipynb)"


💻   Uploading Your Files

After following the steps in "Accessing Utils File and Helper Functions" above ("File" => "Open"), click the "Upload" button to upload your files.


📗   See Your Progress

Once you enroll in this course—or any other short course on the DeepLearning.AI platform—and open it, you can click on 'My Learning' at the top right corner of the desktop view. There, you will be able to see all the short courses you have enrolled in and your progress in each one.

Additionally, your progress in each short course is displayed at the bottom-left corner of the learning page for each course (desktop view).


📱   Features to Use

🎞   Adjust Video Speed: Click on the gear icon (⚙) on the video and then from the Speed option, choose your desired video speed.

🗣   Captions (English and Spanish): Click on the gear icon (⚙) on the video and then from the Captions option, choose to see the captions either in English or Spanish.

🔅   Video Quality: If you do not have access to high-speed internet, click on the gear icon (⚙) on the video and then from the Quality option, choose the quality that works best for your internet speed.

🖥   Picture in Picture (PiP): This feature allows you to continue watching the video when you switch to another browser tab or window. Click on the small rectangle shape on the video to go to PiP mode.

√   Hide and Unhide Lesson Navigation Menu: If you do not have a large screen, you may click on the small hamburger icon beside the title of the course to hide the left-side navigation menu. You can then unhide it by clicking on the same icon again.


🧑   Efficient Learning Tips

The following tips can help you have an efficient learning experience with this short course and other courses.

🧑   Create a Dedicated Study Space: Establish a quiet, organized workspace free from distractions. A dedicated learning environment can significantly improve concentration and overall learning efficiency.

📅   Develop a Consistent Learning Schedule: Consistency is key to learning. Set aside specific times in your day for study and make it a routine. Consistent study times help build a habit and improve information retention.

Tip: Set a recurring event and reminder in your calendar, with clear action items, to get regular notifications about your study plans and goals.

☕   Take Regular Breaks: Include short breaks in your study sessions. The Pomodoro Technique, which involves studying for 25 minutes followed by a 5-minute break, can be particularly effective.

💬   Engage with the Community: Participate in forums, discussions, and group activities. Engaging with peers can provide additional insights, create a sense of community, and make learning more enjoyable.

✍   Practice Active Learning: Don't just read the material, run the notebooks, or watch the videos. Engage actively by taking notes, summarizing what you learn, teaching the concept to someone else, or applying the knowledge in your own projects.


📚   Enroll in Other Short Courses

Keep learning by enrolling in other short courses. We add new short courses regularly. Visit the DeepLearning.AI Short Courses page to see our latest courses and begin learning new topics. 👇

👉👉 🔗 DeepLearning.AI – All Short Courses


🙂   Let Us Know What You Think

Your feedback helps us know what you liked and didn't like about the course. We read all your feedback and use it to improve this course and future courses. Please submit your feedback by clicking on the "Course Feedback" option at the bottom of the lessons list menu (desktop view).

Also, you are more than welcome to join our community 👉👉 🔗 DeepLearning.AI Forum


Welcome to Document AI: From OCR to Agentic Doc Extraction, built in partnership with LandingAI, where I'm executive chairman. A lot of data is locked up in PDF files and other documents on our laptops, on the web, and in companies' cloud storage. In this course, you'll implement document processing pipelines that transform your complex documents into LLM-ready markdown text and extract information for analysis. You'll begin by exploring traditional OCR, which only extracts text, and then add techniques to detect document structure and identify visual components such as tables with complex formatting like merged cells, charts with captions, or forms with checkboxes. You'll implement an agentic workflow that combines layout detection with LLM-based reasoning. Then, you'll learn to use Agentic Document Extraction, or ADE, a tool designed by LandingAI that automates this entire workflow for you.

I'm delighted that the instructors for this course are David Park, who's Senior Director of Applied AI, and Andrea Kropp, who is an Applied AI engineer. Both have helped many developers build complex document AI systems.

Thanks, Andrew. We're excited to teach this course together. Traditional OCR systems worked by segmenting documents into individual characters and then classifying each character using supervised learning. These systems could extract characters accurately, but they lacked an understanding of how the different parts of a document hang together, such as the document structure, tables, forms, or even the reading order of the different elements of a document.

LandingAI's ADE takes a different approach. It treats each page of a document as an image, breaks that image down into parts, and then extracts from each part separately using an agentic workflow. ADE treats documents as visual objects in which meaning is encoded in layout structure and spatial relationships. It uses custom visual models that directly interpret complex tables, graphs, images, charts, and other elements, and grounds every extracted piece of text to a precise location on the page. On top of this, ADE uses an agentic orchestration layer that decomposes complex documents into smaller sections for careful examination in several steps. Humans don't process a document with a single glance. Instead, we examine the different parts of the document to pull out information piece by piece in multiple iterations. ADE works the same way. For example, given a complex document, ADE might extract a table and then further extract the table structure, identifying rows, columns, merged cells, and so on.

In this course, you'll explore traditional OCR-based processing pipelines to see what they can do and also their limitations. You'll also build your own agentic document workflow from scratch. Then you'll learn how to use ADE to process complex documents and build a pipeline to parse mixed documents and extract required key-value pairs. You'll also learn to integrate ADE-extracted information into RAG applications. Finally, you'll implement a cloud-based version on AWS using an event-driven architecture that automatically triggers ADE for document processing whenever new documents appear.

Many people have worked to create this course. I'd like to thank Ava Xia from LandingAI, and Hawraa Salami and Christopher Policastro from DeepLearning.AI. In the first lesson, you'll start by looking at traditional OCR models. Let's go to the next video and dive in.
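
The decompose-then-extract idea described in the introduction can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not LandingAI's ADE API: the `Region` class, the region kinds, and the `extract_table` / `extract_text` handlers are hypothetical names, and the layout regions are hard-coded rather than produced by a real layout-detection model. The sketch orders regions with a naive top-to-bottom, left-to-right rule, hands each one to a type-specific extractor, and keeps the bounding box with every result so the extracted text stays grounded to a location on the page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical layout region: a kind, a bounding box, and its raw content."""
    kind: str        # e.g. "title", "text", "table" (assumed labels)
    bbox: tuple      # (x0, y0, x1, y1) in page pixels
    content: str     # stand-in for what a recognizer would read in this region


def reading_order(regions):
    """Naive reading order: top-to-bottom, then left-to-right (single column)."""
    return sorted(regions, key=lambda r: (r.bbox[1], r.bbox[0]))


def extract_table(region):
    """Second pass on a table region: recover rows and cells from '|' separators."""
    rows = [[cell.strip() for cell in line.split("|")]
            for line in region.content.splitlines()]
    return {"rows": rows}


def extract_text(region):
    """Default handler: return the cleaned-up text of the region."""
    return {"text": region.content.strip()}


# Type-specific extractors; anything not listed falls back to plain text.
EXTRACTORS = {"table": extract_table}


def process_page(regions):
    """Extract each region separately and keep its bounding box for grounding."""
    results = []
    for region in reading_order(regions):
        extractor = EXTRACTORS.get(region.kind, extract_text)
        results.append({"kind": region.kind, "bbox": region.bbox,
                        **extractor(region)})
    return results


# Hard-coded regions standing in for the output of a layout detector.
page = [
    Region("table", (50, 200, 550, 400), "Item | Qty\nPen | 2"),
    Region("title", (50, 40, 550, 80), "Invoice"),
    Region("text", (50, 100, 550, 180), "Thank you for your order. "),
]

results = process_page(page)
for item in results:
    print(item)
```

In a real pipeline the hard-coded `page` list would come from a layout-detection model, and the handlers would call OCR or a vision-language model instead of splitting strings. The structure stays the same: detect regions, order them, extract each one on its own, and carry the coordinates along.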

Document AI: From OCR to Agentic Doc Extraction
  • Introduction (Video, 3 mins)
  • Document Processing Basics (Video, 8 mins)
  • Lab 1: Document Processing with OCR (Video with Code Example, 13 mins)
  • Four Decades of OCR Evolution (Video, 8 mins)
  • Lab 2: Document Processing with PaddleOCR (Video with Code Example, 18 mins)
  • Layout Detection and Reading Order (Video, 12 mins)
  • Lab 3: Building Agentic Document Understanding (Video with Code Example, 14 mins)
  • A Single API for Agentic Document Understanding (Video, 6 mins)
  • Lab 4: Document Understanding with Agentic Document Extraction (Video with Code Example, 15 mins)
  • Lab 4: Document Understanding with Agentic Document Extraction II (Video with Code Example, 12 mins)
  • Agentic Document Extraction for RAG (Video, 14 mins)
  • Lab 5: Agentic Document Extraction for RAG (Video with Code Example, 9 mins)
  • Building RAG Pipelines with Agentic Document Extraction on AWS (Video, 14 mins)
  • Lab 6: Building a Research Paper Chatbot with Strands Agents (Video, 16 mins)
  • Conclusion (Video, 1 min)
  • Quiz (Graded Quiz, 10 mins)
  • Links & Resources (Reading, 10 mins)
  • Accomplishment (Course Info)