Overview
This course provides an introduction to data engineering with Databricks, covering key tools and frameworks such as Delta Lake, Databricks Workflows, Delta Live Tables, and Unity Catalog. Participants will learn how to ingest, transform, and manage data using Delta Lake, deploy workloads with Databricks Workflows, build efficient pipelines with Delta Live Tables, and apply data governance principles using Unity Catalog. The course includes hands-on labs and real-world applications to ensure learners develop practical skills for working with Databricks effectively.
This course prepares learners for the Databricks Certified Data Engineer Associate exam and provides the foundational knowledge required to advance to the Advanced Data Engineering with Databricks course.
Prerequisites
Participants should have:
- Beginner familiarity with basic cloud concepts (virtual machines, object storage, identity management).
- Ability to perform basic code development tasks (e.g., creating compute instances, running code in notebooks, using basic notebook operations, and importing repositories from Git).
- Intermediate familiarity with SQL, including commands such as CREATE, SELECT, INSERT, UPDATE, DELETE, GROUP BY, and JOIN.
- Intermediate experience with SQL concepts such as aggregate functions, filters, sorting, indexes, tables, and views.
- Basic knowledge of Python programming, the Jupyter Notebook interface, and PySpark fundamentals.
Target Audience
This course is designed for:
- Data Engineers who want to enhance their knowledge of Databricks and Delta Lake.
- Data Analysts looking to expand their expertise in data pipelines and transformation.
- Cloud Engineers and Developers working with big data frameworks.
- Professionals preparing for the Databricks Certified Data Engineer Associate exam.
Delegates will learn how to
By the end of this course, learners will be able to:
- Ingest, transform, and manage data using Delta Lake.
- Deploy and monitor data workloads with Databricks Workflows.
- Build scalable data pipelines using Delta Live Tables and the Medallion Architecture.
- Apply data governance principles and manage permissions using Unity Catalog.
- Troubleshoot, optimise, and monitor data workflows in Databricks.
Outline
Data ingestion with Delta Lake
- Delta Lake and data objects
- Setting up and loading Delta tables
- Basic data transformations
- Lab: Loading data into Delta tables
- Cleaning and preparing data
- Complex transformations
- Using SQL UDFs
- Advanced Delta Lake features
- Lab: Manipulating Delta tables
Deploy workloads with Databricks Workflows
- Introduction to Databricks Workflows
- Jobs compute
- Scheduling tasks using the Jobs UI
- Lab: Creating and managing jobs in Databricks
- Exploring job features
- Conditional tasks and repairing runs
- Modular orchestration of workflows
- Best practices for Databricks Workflows
Build data pipelines with Delta Live Tables
- Understanding the Medallion Architecture
- Introduction to Delta Live Tables
- Using the Delta Live Tables UI
- Developing SQL pipelines
- Developing Python pipelines
- Running modes in Delta Live Tables
- Monitoring pipeline results and event logs
- Optional: Landing new data
Data management and governance with Unity Catalog
- Overview of data governance in Databricks
- Demo: Populating the Metastore
- Lab: Navigating the Metastore
- Organisation and access patterns in Unity Catalog
- Demo: Upgrading tables to Unity Catalog
- Security and administration features
- Overview of Databricks Marketplace
- Managing privileges in Unity Catalog
- Demo: Controlling access to data
- Fine-grained access control
- Lab: Migrating and managing data with Unity Catalog
Exams and assessments
This course does not include formal assessments.
Hands-on learning
This course features:
- Interactive labs to apply concepts in a real-world Databricks environment.
- Guided exercises demonstrating how to configure and optimise Delta Lake, Workflows, and Unity Catalog.
- Real-world case studies showcasing best practices in data engineering with Databricks.
- Troubleshooting scenarios to develop problem-solving skills.

Frequently asked questions
How can I create an account on myQA.com?
There are a number of ways to create an account. If you are a self-funder, simply select the "Create account" option on the login page.
If you have been booked onto a course by your company, you will receive a confirmation email. From this email, select "Sign into myQA" and you will be taken to the "Create account" page. Complete all of the details and select "Create account".
If you have the booking number, you can also select the "I have a booking number" option. Enter the booking reference and your surname. If the details match, you will be taken to the "Create account" page, where you can enter your details and confirm your account.
Find more answers to frequently asked questions in our FAQs: Bookings & Cancellations page.
How do QA’s virtual classroom courses work?
Our virtual classroom courses allow you to access award-winning classroom training, without leaving your home or office. Our learning professionals are specially trained on how to interact with remote attendees and our remote labs ensure all participants can take part in hands-on exercises wherever they are.
We use the WebEx video conferencing platform by Cisco. Before you book, check that you meet the WebEx system requirements and run a test meeting to ensure the software is compatible with your firewall settings. If it doesn’t work, try adjusting your settings or contact your IT department about permitting the website.
How do QA’s online courses work?
QA online courses, also commonly known as distance learning or e-learning courses, take the form of interactive software designed for individual learning. You will also have access to full support from our subject-matter experts for the duration of your course. When you book a QA online learning course, you will receive immediate access to it through our e-learning platform and can start learning straight away from any compatible device. Access to the online learning platform is valid for one year from the booking date.
All courses are built around case studies and presented in an engaging format, which includes storytelling elements, video, audio and humour. Every case study is supported by sample documents and a collection of Knowledge Nuggets that provide more in-depth detail on the wider processes.
When will I receive my joining instructions?
Joining instructions for QA courses are sent two weeks prior to the course start date, or immediately if the booking is confirmed within this timeframe. For course bookings made via QA but delivered by a third-party supplier, joining instructions are sent to attendees prior to the training course, but timescales vary depending on each supplier’s terms.
When will I receive my certificate?
Certificates of Achievement are issued at the end of the course, either as a hard copy or via email.