Creating a DataSet
A DataSet defines a collection of data that can be queried and transformed into widgets. It consists of four main components: a database connector, a primary database table, join tables, and fields. Once a DataSet has been defined, it can be queried to generate a wide variety of visualizations.
Some Definitions
- Database Connection
- The database connector is a connection to a database. It contains all of the connection details and supplies the functions for connecting to the database. It also provides helper functions that generate SQL where database platforms differ.
- Table
- The base table to query from in your database. This is the table that goes in the FROM clause of the SQL queries generated by fireant.
- Join
- Joins specify how to join additional tables. Each join is instantiated with another PyPika Table and a PyPika expression describing how to join the two tables. A join can also build on another join by using an expression that links it to the other join's table (see below for an example). fireant will automatically determine which joins are necessary on a per-query basis.
- Field
- Fields are the bread and butter of fireant. They define what types of data are available and are ultimately what is referenced when building up queries. Fields are defined with a PyPika expression.
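The per-query join selection described above can be pictured as a walk over join dependencies. The following is a minimal sketch of that idea in plain Python, not fireant's actual implementation: `JoinSpec` and `required_joins` are hypothetical names invented for illustration, and tables are represented as plain strings rather than PyPika objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinSpec:
    """Hypothetical stand-in for a join (not a fireant class)."""
    table: str             # the table this join adds to the query
    depends_on: frozenset  # other tables referenced by the join criterion


def required_joins(base_table, joins, field_tables):
    """Return the joins needed to reach every table in field_tables,
    ordered so that each join comes after the joins it depends on."""
    by_table = {join.table: join for join in joins}
    ordered = []
    seen = {base_table}  # the base table is always available via FROM

    def visit(table):
        if table in seen:
            return
        seen.add(table)
        join = by_table[table]  # KeyError: no join is defined for this table
        # Tables referenced by the join criterion must be joined first.
        for dependency in sorted(join.depends_on):
            visit(dependency)
        ordered.append(join)

    for table in sorted(field_tables):
        visit(table)
    return ordered


joins = [
    JoinSpec("customers", frozenset({"analytics"})),
    JoinSpec("regions", frozenset({"customers"})),  # a join based on another join
]

# A query that only uses fields from the base table needs no joins.
print([j.table for j in required_joins("analytics", joins, {"analytics"})])
# []

# A query that uses a field from regions also pulls in customers, first.
print([j.table for j in required_joins("analytics", joins, {"regions"})])
# ['customers', 'regions']
```

In this sketch a join is included only when a selected field references its table, directly or through another join, which is why unused tables never appear in the query.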
Code Example
from fireant.dataset import *
from fireant.database.vertica import VerticaDatabase
from pypika import Tables, functions as fn

vertica_database = VerticaDatabase(user='jane_doe', password='strongpassword123')
analytics, customers = Tables('analytics', 'customers')

dataset = DataSet(
    database=vertica_database,
    table=analytics,
    joins=[
        Join(customers, analytics.customer_id == customers.id),
    ],
    fields=[
        # Non-aggregate definition
        Field(alias='customer',
              definition=customers.id,
              label='Customer'),

        # Date/Time type, also non-aggregate
        Field(alias='date',
              definition=analytics.timestamp,
              type=DataType.date,
              label='Date'),

        # Aggregate definition (The SUM function aggregates a group of values into a single value)
        Field(alias='clicks',
              definition=fn.Sum(analytics.clicks),
              label='Clicks'),
    ],
)
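To make the difference between non-aggregate and aggregate fields concrete, here is a small illustration using Python's built-in `sqlite3` module instead of Vertica. The query is written by hand and is only an assumption about the general shape of SQL that a definition like the one above corresponds to (base table in `FROM`, one join, a grouped non-aggregate field, and a summed aggregate field). It is not output captured from fireant, and the sample rows are made up.

```python
import sqlite3

# In-memory database with made-up rows mirroring the two tables in the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE analytics (customer_id INTEGER, timestamp TEXT, clicks INTEGER);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO analytics VALUES
        (1, '2024-01-01', 10),
        (1, '2024-01-02', 5),
        (2, '2024-01-01', 7);
""")

# Hand-written query (an assumption, not fireant output):
# - analytics is the base table and goes in the FROM clause
# - customers is reached through the join criterion
# - customers.id is non-aggregate, so it is grouped on
# - SUM(analytics.clicks) collapses each group into a single value
query = """
    SELECT customers.id AS customer, SUM(analytics.clicks) AS clicks
    FROM analytics
    JOIN customers ON analytics.customer_id = customers.id
    GROUP BY customers.id
    ORDER BY customers.id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 15), (2, 7)]
```

Each customer appears once in the result, with its clicks summed across all of its analytics rows.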