It can be hard to know where to start with a data engineering project: selecting the appropriate tools to employ, and determining the subsequent steps once the data is acquired, are challenges in themselves. So to help inspire you, I have collected several examples of data engineering projects that should help drive you forward.

## What should you look for in a data engineering project?

Before starting, I wanted to provide a few tips. When you look to build a data engineering project, there are a few key areas you should focus on:

- Multiple types of data sources (APIs, webpages, CSVs, JSON, etc.)
- Data ingestion and orchestration tools like Mage and Airflow
- Data storage, such as Snowflake, BigQuery, Apache Iceberg, or Postgres
- Data visualization (so you have something to show for your efforts)
- Use of multiple tools (even if some tools may not be the perfect solution, why not experiment with Kinesis or Spark to become familiar with them?)

Each of these areas will help you as a data engineer improve your skills and understand the data pipeline as a whole. In particular, creating some sort of end visual, especially if it involves building a basic website to host it, can be a fun way to show your projects off.

But enough talk, let's dig into some ideas for your data engineering projects.

## PredictIt Data With Airflow And Snowflake

Pulling data from APIs and storing it for later analysis is a classic task for most data engineers. Whether it's Netsuite or Lightspeed POS, you'll likely need to create a data connector to pull said data. So this first project has you pull data from PredictIt.

- Data Source: PredictIt, a marketplace for political predictions.
- Ingestion Method: You'll use a Python operator in Airflow and run it in Managed Workflows for Apache Airflow (MWAA) for ease of use.
- Data Storage: The raw data will be stored in an S3 bucket. From there, the data will be transferred to Snowflake for analytical storage.
- Data Transformations: There are a lot of ways you can transform data, but you can always use Snowflake's tasks to perform the transformations.
- Visualization: Tableau will be used for data visualization.

You'll use these tools to create a basic function for scraping JSON data from a PredictIt API endpoint. The project's creator explains the structure of the data and how it will be loaded into S3. Then you will use Airflow to create a DAG (Directed Acyclic Graph) that defines the workflow for the project. The DAG includes tasks for scraping the data, plus a dummy task called "ready," which aids in referencing the last task of the DAG.

From there you can take said data and build a visualization of what is going on in the political landscape. Who is currently taking the lead in terms of betting lines?

But this is just a basic example of a data engineering project. Let's dig into another great one.

## Uber Data Analytics | End-To-End Data Engineering Project

In this project, you will explore an Uber-like dataset. The project was created by Darshil, who makes probably the best data engineering project videos on YouTube, and this is currently his most popular one. By following the steps outlined in the video, you'll learn how to build a data pipeline, model data, and so much more.

The project involves the following steps:

- Data Source: Uber data set, provided in the Git repo.