I spent far too much of my last job teaching developers about indexes, query plans, and the underlying join types, and how they affect performance and memory consumption.
Not Vertica, though that looks very interesting. I do have a lot of experience with Redshift. The difficulty is that most data warehouse implementations are fairly bespoke, even down to query planning and execution, so knowledge of Redshift may not transfer completely to Vertica, for instance.
Thanks. But how does one approach learning the internals of these things? It's not like MySQL, SQL Server, or PostgreSQL, for which we have tons of books and very detailed documentation. For Vertica we only have the docs, no books, just provided as is.
That seems to be the norm for everything that took off around 2010. Of course, many are open source, so those are OK, I guess.
The thing is finding the terminology. In the case of Redshift, that is Sort Key, Distribution Key, and primary key (these aren't true primary keys, since they are not enforced, but they do influence the query planner).
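To make the distribution key idea a bit more concrete, here is a rough toy model in Python. It is not Redshift code or any Redshift API: `NUM_SLICES`, `slice_for`, `rows_to_move`, and the sample `orders` rows are all made up for illustration, and the real hashing scheme is certainly different. It assumes a second table distributed on `customer_id` and counts how many `orders` rows would sit on the "wrong" slice for a join on `customer_id`.

```python
# Toy model only: every name here is hypothetical, not part of Redshift.
NUM_SLICES = 4  # assumed number of slices in the cluster


def slice_for(value, num_slices=NUM_SLICES):
    """Pick a slice by hashing the distribution key value."""
    return hash(value) % num_slices


def rows_to_move(rows, dist_key, join_key, num_slices=NUM_SLICES):
    """Count rows stored on a different slice than the one their
    join key hashes to, i.e. rows that must be shuffled for the join."""
    return sum(
        1
        for row in rows
        if slice_for(row[dist_key], num_slices)
        != slice_for(row[join_key], num_slices)
    )


# Made-up sample data: 20 orders spread over 8 customers.
orders = [{"order_id": i, "customer_id": (i * 3) % 8} for i in range(100, 120)]

# Distributing on order_id scatters each customer's orders across slices.
moved_if_dist_on_order = rows_to_move(orders, "order_id", "customer_id")

# Distributing on the join column keeps matching rows together.
moved_if_dist_on_customer = rows_to_move(orders, "customer_id", "customer_id")

print("rows to move, distributed on order_id:   ", moved_if_dist_on_order)
print("rows to move, distributed on customer_id:", moved_if_dist_on_customer)
```

When the table is distributed on the join column, nothing has to move. That is roughly why the choice of distribution key shows up in the query plan.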