How to Drop Array/Map Nested Struct Fields in Databricks
2025/06/11
Environment
Databricks Runtime 16.4LTS
The problem
Just as the title says, sometimes we need to drop some highly nested fields in a Delta Lake table. If the fields only belong to nested structs, then just separating the path by dots is enough (for example a.b.c.d
). But when fields are nested in an array or map, then the SQLs are not very intuitive to write.
The official documentations of Delta Lake and Databricks didn’t provide good examples of how to drop a nested struct field within an Array. Though it provides an article about how to do it with the Map type here.
In that article, the correct SQL query to drop the b
in data map<int, map<int, struct<a: int, b: string>>>
is DROP COLUMN data.value.value.b
. The interesting part is where does the value
field come from. Yes we all know a map contains key-value pairs and the value
comes from the value part of the map, but no documentation mentioned the corresponding name.
Then what if we need to delete a field in a key struct? It’s easy to guess and the answer is using key
as in data.key.key.foo
.
From the Databricks UI table overview tab, we can see a Map structure contains keys
and values
fields, just corresponding the key
and value
in above SQLs, which is just a coincidence.

Why this is a coincidence? Because If we check the structure of a array<array<struct<a: int, b: string>>>
, we’ll get items:

Then if we remove the plural form and use DROP COLUMN data.item.item.a
, an error will occur.
So the keyword is not item
for Array.
The true source of the internal field name lies in Parquet’s logical type documentation, where List(alias Array) has an element
field, and Map has key
and value
fields. So the right keyword for Array is element
.
<list-repetition> group <name> (LIST) {
repeated group list {
<element-repetition> <element-type> element;
}
}
<map-repetition> group <name> (MAP) {
repeated group key_value {
required <key-type> key;
<value-repetition> <value-type> value;
}
}
Full examples
Drop map struct fields
CREATE OR REPLACE TABLE nested_table1 (id int, data map<int, map<struct<c: int, d: string>, struct<a: int, b: string>>>);
INSERT INTO nested_table1
SELECT
111 AS id,
map(2, map(struct(4, "aaa"), struct(5, "bbb"))) AS data;
ALTER TABLE nested_table1 SET TBLPROPERTIES ('delta.columnMapping.mode' = 'name');
ALTER TABLE nested_table1 DROP COLUMNS (data.value.key.d, data.value.value.a);
SELECT * FROM nested_table1;
Drop array struct fields
CREATE OR REPLACE TABLE nested_table2 (id int, data array<array<struct<a: int, b: string>>>);
INSERT INTO nested_table2
SELECT
111 AS id,
array(array(struct(4, "aaa"), struct(5, "bbb"))) AS data;
ALTER TABLE nested_table2 SET TBLPROPERTIES ('delta.columnMapping.mode' = 'name');
ALTER TABLE nested_table2 DROP COLUMN data.element.element.a;
SELECT * FROM nested_table2;