Home » bigquery

bigquery

Using ‘PARTITION BY’ in BigQuery to calculate the share per group

  • by
  • 3 min read

In this blog post we’ll calculate the share of a row per group. There are multiple ways to do this: with aggregate functions and with analytic functions. We’ll go over both and demonstrate how analytic functions are far superior in terms of readability. First, let’s create a table to demonstrate… 

Dataform: Create if incremental table doesn’t exist (JS)

  • by
  • 2 min read

One of Google’s latest acquisitions, ELT data orchestration tool Dataform, supports JavaScript. This is a great way to chain Dataform’s actions. In this blog post we explore how to automatically create a destination table in your data warehouse, before an incremental definition is run. If you are interested in a… 

BigQuery: JSON_EXTRACT vs. JSON_QUERY

  • by
  • 2 min read

In this blog post, I discuss the difference between several BigQuery functions that appear to do the same: JSON_EXTRACT and JSON_QUERY — and their value counterparts JSON_EXTRACT_SCALAR and JSON_VALUE. MPP Data Warehouses are increasingly becoming data lakes. In this light, extracting strings/values from semistructured data (stored as a JSON STRING)… 

Subtracting dates in BigQuery SQL

  • by
  • 2 min read

In this blog post, I elaborate on two specific operations you’ll run into when using dates in BigQuery. Subtracting an interval from a data and calculating the difference between two dates. Subtracting a unit Subtracting a specific amount of days, weeks, months, quarters, or years from a date can be… 

Solve “invalid table reference” when streaming into BigQuery

  • by
  • 2 min read

Recently I decided to start streaming clickstream data (from this blog, indeed) into BigQuery using the insertAll method. It is a relatively cheap solution that requires almost no technical configuration. Until I bumped into an error. Let’s tackle it in this blog post. The error I encountered is: “google.api_core.exceptions.BadRequest: 400…