Creating Tables and Loading Data
To get started, simply navigate to the Google Cloud Console, by default a new project is created for you. However, you can create a new project by clicking on “Select a project” at the top of the page and then “New Project”. Name your product and note the Project ID for later use. However, by default BigQuery creates a project when you signup for the first-time. Instead of creating a new one we will be using the default.
Before loading data into BigQuery, we need to understand the structure of how data is loaded into BigQuery. When you upload data into BigQuery it is stored inside a folder called a Dataset. Every data one uploads is organized into a folder(the Dataset), so prior to uploading the data/table we need to create a Dataset. Create one by clicking on the explorer, and create a new database.
Since we are dealing with powerlifting dataset, lets label it as follows, am going to choose “Multi-region” for the Location Type as this allows for more bandwidth and “US” which allows Google to store our data on multiple servers across the US. Am going to leave the Enable Table Expiration disabled as this deletes the table after a certain amount of time. Finally, just click the Create Dataset option.
To create a table within our dataset, click on the dataset we created and in the window choose “Create Table”. In the new prompt is where we choose how we want to load our data. Since am uploading from my local machine, am going to choose “Upload”, and then the Destination am going to choose the same Project title, for the tablename am going to name it "sourcepowerlifting”. For the Schema part, the table currently does not have a Schema.
Schema essentially defines how data is organized within a relational database. This includes logical constraints such as table names, fields, data types, and the relationships between these entities.
BigQuery provides one with two options for choosing the schema, either manually or it can detect the schema automatically, let’s go with the latter, and then leave the rest as default and then finally “Create Table”.
When we click on our table we can now see the Schema information, and the table details. One particular key information will be the Table ID which will be useful in referencing our table to run the queries.
Lets load our second table, this time it will be faster, just click on “+Add” button and choose “Local File”. Note there is the option to add data from your Google Cloud Storage or connect to External data sources such as PostgreSQL, MySQL, BigLake among others. For us its just local file, am going to choose Dataset as “powerlifting” and table as “meets” and schema as “autodetect”.
Let’s Get Querying
To start querying, click on the “sourcepowerlifting” table we created and copy the ‘Table ID” click the “Query” option above and choose in “New Tab”.
Now we can query!