Time travel in Databricks
Delta Lake time travel allows you to query an older snapshot of a Delta Lake table.
For timestamp_string, only date or timestamp strings are accepted. For example, "2019-01-01" and "2019-01-01T00:00:00.000Z".
To query an older version of a table, specify a version or timestamp in a SELECT statement. For example, to query version 0 from the history above, use:
SELECT * FROM events VERSION AS OF 0
or
SELECT * FROM events TIMESTAMP AS OF '2019-01-29 00:37:58'
Note
Because version 1 is at timestamp '2019-01-29 00:38:10', to query version 0 you can use any timestamp in the range '2019-01-29 00:37:58' to '2019-01-29 00:38:09' inclusive.
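The rule in the note can be sketched in plain Python: a timestamp maps to the latest version committed at or before it. This is only an illustration of that rule, not Delta Lake's implementation. The HISTORY list and the resolve_version helper are hypothetical names, the two commit times are the ones quoted in the note, and the sketch does not model how Delta Lake handles timestamps later than the latest commit.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical commit history built from the note: (version, commit time).
HISTORY = [
    (0, datetime(2019, 1, 29, 0, 37, 58)),
    (1, datetime(2019, 1, 29, 0, 38, 10)),
]


def resolve_version(timestamp_string, history=HISTORY):
    """Return the latest version committed at or before the given timestamp."""
    ts = datetime.strptime(timestamp_string, "%Y-%m-%d %H:%M:%S")
    commit_times = [commit_time for _, commit_time in history]
    # bisect_right counts the commits with commit_time <= ts.
    index = bisect_right(commit_times, ts) - 1
    if index < 0:
        raise ValueError(f"{timestamp_string} is before the earliest commit")
    return history[index][0]


print(resolve_version("2019-01-29 00:37:58"))  # 0
print(resolve_version("2019-01-29 00:38:09"))  # 0
print(resolve_version("2019-01-29 00:38:10"))  # 1
```

Every timestamp from 00:37:58 through 00:38:09 resolves to version 0, and 00:38:10 is the first timestamp that resolves to version 1, which matches the inclusive range given in the note.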
DataFrameReader options allow you to create a DataFrame from a Delta Lake table that is fixed to a specific version of the table.
df1 = spark.read.format("delta").option("timestampAsOf", timestamp_string).load("/delta/events")
df2 = spark.read.format("delta").option("versionAsOf", version).load("/delta/events")
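Since each read is pinned by either a version or a timestamp, a small helper can keep the two options from being mixed up. The following is a minimal sketch: time_travel_options is a hypothetical helper, not part of Spark or Delta Lake, and only the option names versionAsOf and timestampAsOf come from the examples above.

```python
def time_travel_options(version=None, timestamp_string=None):
    """Build DataFrameReader options for exactly one time travel selector."""
    # Exactly one of the two selectors must be provided.
    if (version is None) == (timestamp_string is None):
        raise ValueError("Specify exactly one of version or timestamp_string")
    if version is not None:
        return {"versionAsOf": str(version)}
    return {"timestampAsOf": timestamp_string}


print(time_travel_options(version=0))
print(time_travel_options(timestamp_string="2019-01-01"))
```

Assuming an active SparkSession named spark, the returned dictionary can be passed along as spark.read.format("delta").options(**opts).load("/delta/events").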