spark

How to do a SUMIF in PySpark

  • 2 min read

One of the most frequently used Excel functions is probably SUMIF, along with its SUMIFS variant. In this article, you’ll learn how to do exactly the same in PySpark. What is the SUMIF function? In Excel, the SUMIF function is an aggregation function for summing values from a column, but only…

Spark 3.0: Solving the “dates before 1582-10-15 or timestamps before 1900-01-01T00:00:00Z” error

In the past couple of weeks, I’ve been working on a project which uses Spark pools in Azure Synapse. However, this appears to be a general Spark issue. I was unable to write to Delta Lake using Spark because I received the following error. You may get a different result…