The purpose of this exercise is to get you focused on some key questions that may come up during an interview. 🚀

You may not know the answers to the questions below, let alone how the concepts work in code. That's okay! Do some research of your own and try to understand the concepts.

Copy the questions below into Microsoft Word or Google Docs, or duplicate this page into your own Notion page. Then, under each question, write an answer in 2-3 lines.

As you learn more technologies and tools, you can add to the list below. Over time, you will build a list of key concepts that you can explain.

  1. What is ETL? Which ETL tools have you worked with? Do you have a favorite one? If so, why?
  2. What is a NoSQL database? Tell me about a situation where building a NoSQL database was a better solution than building a relational database.
  3. How does a data warehouse differ from an operational database?
  4. What are structured and unstructured data? Discuss a time when you transformed unstructured data into structured data.
  5. What is Hadoop? Describe its components.
  6. Which Python libraries would you use for efficient data processing?
  7. What are the four V's of big data?
  8. What is the difference between a star schema and a snowflake schema?
  9. What non-technical skills do you think come in most handy as a data engineer?
  10. What is the NameNode in Hadoop? What type of data is stored in the NameNode?

...Add your own questions below and keep a running list of other interview questions.