Data modeling and query language for data analysis
Abstract
Many programming languages have been created for developers over the past 50 years, but only one main language is used for interpreting, analyzing and manipulating data: SQL. SQL is the dominant language used by developers and data analysts to work with data in relational databases [1], largely for historical reasons, but it has many inherent weaknesses. SQL began as a “simple, ad hoc” language shaped by “design by implementation”, and using it for data analytics is not all sunshine and roses. Today, with the rise of the data modeling approach to organizing and analyzing data, SQL is too low-level to be a suitable language for the task.
My thesis explores the idea of “analytics as code” by combining a data modeling approach with a new query language designed for data analysts. It first elaborates on SQL’s weaknesses, then examines the existing alternatives, and finally proposes a new analytics modeling and query language, called AML, that fits well with a data model architecture.
Keywords: data modeling approach, functional query language, analytics as code.