2 posts tagged

bootstrap

How to build a dashboard with Bootstrap 4 from scratch (Part 2)

⏱ Время чтения текста – 12 минут

Previously we shared how to use Bootstrap components in building dashboard layout and designed a simple yet flexible dashboard with a scatter plot and Russian map. In today’s material, we will continue adding more information, explore how to make Bootstrap tables responsive, and cover some complex callbacks for data acquisition.

Constructing Data Tables

All the code for populating our tables with data will be stored in get_tables.py , while the layout components areoutlined in application.py. This article will cover the process of creating the table with top Russian Breweries, however, you can find the code for creating the other three on Github.

Data in the Top Breweries table can be filtered by city name in the dropdown menu, but the data collected in Untappd is not equally structured. Some city names are written in Latin, others in Cyrillic. So the challenge is to make the names equal for SQL queries, and here is where Google Translate comes to the rescue. Though we sill have to manually create a dictionary of city names, since for example “Москва” can be written as “Moskva” and not “Moscow”. This dictionary will be used later for mapping our DataFrame before transforming it into a Bootstrap table.

import pandas as pd
import dash_bootstrap_components as dbc
from clickhouse_driver import Client
import numpy as np
from googletrans import Translator

translator = Translator()

client = Client(host='12.34.56.78', user='default', password='', port='9000', database='')

city_names = {
   'Moskva': 'Москва',
   'Moscow': 'Москва',
   'СПБ': 'Санкт-Петербург',
   'Saint Petersburg': 'Санкт-Петербург',
   'St Petersburg': 'Санкт-Петербург',
   'Nizhnij Novgorod': 'Нижний Новгород',
   'Tula': 'Тула',
   'Nizhniy Novgorod': 'Нижний Новгород',
}

Top Breweries Table

This table displays top 10 Russian breweries and their position change according to the rating. Simply put, we need to compare data for two periods, that’s [30 days ago; today] and [60 days ago; 30 days ago]. With this in mind, we will need the following headers: ranking, brewery name, position change, and number of check-ins.
Create the get_top_russian_breweries function that would make queries to the Clickhouse DB, sort the data and return a refined Pandas DataFrame. Let’s send the following queries to obtain data for the past 30 and 60 days, ordering the results by the number of check-ins.

Querying data from the Database

def get_top_russian_breweries(checkins_n=250):
   top_n_brewery_today = client.execute(f'''
      SELECT  rt.brewery_id,
              rt.brewery_name,
              beer_pure_average_mult_count/count_for_that_brewery as avg_rating,
              count_for_that_brewery as checkins FROM (
      SELECT           
              brewery_id,
              dictGet('breweries', 'brewery_name', toUInt64(brewery_id)) as brewery_name,
              sum(rating_score) AS beer_pure_average_mult_count,
              count(rating_score) AS count_for_that_brewery
          FROM beer_reviews t1
          ANY LEFT JOIN venues AS t2 ON t1.venue_id = t2.venue_id
          WHERE isNotNull(venue_id) AND (created_at >= (today() - 30)) AND (venue_country = 'Россия') 
          GROUP BY           
              brewery_id,
              brewery_name) rt
      WHERE (checkins>={checkins_n})
      ORDER BY avg_rating DESC
      LIMIT 10
      '''
   )

top_n_brewery_n_days = client.execute(f'''
  SELECT  rt.brewery_id,
          rt.brewery_name,
          beer_pure_average_mult_count/count_for_that_brewery as avg_rating,
          count_for_that_brewery as checkins FROM (
  SELECT           
          brewery_id,
          dictGet('breweries', 'brewery_name', toUInt64(brewery_id)) as brewery_name,
          sum(rating_score) AS beer_pure_average_mult_count,
          count(rating_score) AS count_for_that_brewery
      FROM beer_reviews t1
      ANY LEFT JOIN venues AS t2 ON t1.venue_id = t2.venue_id
      WHERE isNotNull(venue_id) AND (created_at >= (today() - 60) AND created_at <= (today() - 30)) AND (venue_country = 'Россия')
      GROUP BY           
          brewery_id,
          brewery_name) rt
  WHERE (checkins>={checkins_n})
  ORDER BY avg_rating DESC
  LIMIT 10
  '''
)

Creating two DataFrames with the received data:

top_n = len(top_n_brewery_today)
column_names = ['brewery_id', 'brewery_name', 'avg_rating', 'checkins']

top_n_brewery_today_df = pd.DataFrame(top_n_brewery_today, columns=column_names).replace(np.nan, 0)
top_n_brewery_today_df['brewery_pure_average'] = round(top_n_brewery_today_df.avg_rating, 2)
top_n_brewery_today_df['brewery_rank'] = list(range(1, top_n + 1))

top_n_brewery_n_days = pd.DataFrame(top_n_brewery_n_days, columns=column_names).replace(np.nan, 0)
top_n_brewery_n_days['brewery_pure_average'] = round(top_n_brewery_n_days.avg_rating, 2)
top_n_brewery_n_days['brewery_rank'] = list(range(1, len(top_n_brewery_n_days) + 1))

And then calculate the position change over the period of time for each brewery received. With the try-except block, we will handle exceptions, in case, if a brewery was not yet in our database 60 days ago.

rank_was_list = []
for brewery_id in top_n_brewery_today_df.brewery_id:
   try:
       rank_was_list.append(
           top_n_brewery_n_days[top_n_brewery_n_days.brewery_id == brewery_id].brewery_rank.item())
   except ValueError:
       rank_was_list.append('–')
top_n_brewery_today_df['rank_was'] = rank_was_list

Now we iterate over the columns with current and former positions. If there is no hyphen contained in, we will append an up or down arrow depending on the change.

diff_rank_list = []
for rank_was, rank_now in zip(top_n_brewery_today_df['rank_was'], top_n_brewery_today_df['brewery_rank']):
   if rank_was != '–':
       difference = rank_was - rank_now
       if difference > 0:
           diff_rank_list.append(f'↑ +{difference}')
       elif difference < 0:
           diff_rank_list.append(f'↓ {difference}')
       else:
           diff_rank_list.append('–')
   else:
       diff_rank_list.append(rank_was)

Finally, replace DataFrame headers, inserting the column with current ranking positions, where the top 3 will be displayed with the trophy emoji.

df = top_n_brewery_today_df[['brewery_name', 'avg_rating', 'checkins']].round(2)
df.insert(2, 'Position change', diff_rank_list)
df.columns = ['NAME', 'RATING', 'POSITION CHANGE', 'CHECK-INS']
df.insert(0, 'RANKING', list('🏆 ' + str(i) if i in [1, 2, 3] else str(i) for i in range(1, len(df) + 1)))

return df

Filtering data by city name

One of the main tasks we set before creating this dashboard was to find out what are the most liked breweries in a certain city. The user chooses a city in the dropdown menu and gets the results. Sound pretty simple, but is it that easy?
Our next step is to write a script that would update data for each city and store it in separate CSV files. As we mentioned earlier, the city names are not equally structured, so we need to use Google Translator within the if-else block, and since it may not convert some names to Cyrillic we need to explicitly specify such cases:

en_city = venue_city
if en_city == 'Nizhnij Novgorod':
      ru_city = 'Нижний Новгород'
elif en_city == 'Perm':
      ru_city = 'Пермь'
elif en_city == 'Sergiev Posad':
      ru_city = 'Сергиев Посад'
elif en_city == 'Vladimir':
      ru_city = 'Владимир'
elif en_city == 'Yaroslavl':
      ru_city = 'Ярославль'
else:
      ru_city = translator.translate(en_city, dest='ru').text

Then we need to add both city names in English and Russian to the SQL query, to receive all check-ins sent from this city.

WHERE (rt.venue_city='{ru_city}' OR rt.venue_city='{en_city}')

Finally, we export received data into a CSV file in the following directory – data/cities.

df = top_n_brewery_today_df[['brewery_name', 'venue_city', 'avg_rating', 'checkins']].round(2)
df.insert(3, 'Position Change', diff_rank_list)
df.columns = ['NAME', 'CITY', 'RATING', 'POSITION CHANGE', 'CHECK-INS']
# MAPPING
df['CITY'] = df['CITY'].map(lambda x: city_names[x] if (x in city_names) else x)
# TRANSLATING
df['CITY'] = df['CITY'].map(lambda x: translator.translate(x, dest='en').text)
df.to_csv(f'data/cities/{en_city}.csv', index=False)
print(f'{en_city}.csv updated!')

Scheduling Updates

We will use the apscheduler library to automatically run the script and refresh data for each city in all_cities every day at 10:30 am (UTC).

from apscheduler.schedulers.background import BackgroundScheduler
from get_tables import update_best_breweries

all_cities = sorted(['Vladimir', 'Voronezh', 'Ekaterinburg', 'Kazan', 'Red Pakhra', 'Krasnodar',
             'Kursk', 'Moscow', 'Nizhnij Novgorod', 'Perm', 'Rostov-on-Don', 'Saint Petersburg',
             'Sergiev Posad', 'Tula', 'Yaroslavl'])

scheduler = BackgroundScheduler()
@scheduler.scheduled_job('cron', hour=10, misfire_grace_time=30)
def update_data():
   for city in all_cities:
       update_best_breweries(city)
scheduler.start()

Table from DataFrame

get_top_russian_breweries_table(venue_city, checkins_n=250) will accept venue_city and checkins_n generating a Bootstrap Table with the top breweries. The second parameter value, checkins_n can be changed with the slider. If the city name is not specified, the function will return top Russian breweries table.

if venue_city == None: 
      selected_df = get_top_russian_breweries(checkins_n)
else: 
      en_city = venue_city

In other case the DataFrame will be constructed from a CSV file stored in data/cities/. Since the city column still may contain different names we should apply mapping and use a lambda expression with the map() method. The lambda function will compare values in the column against keys in city_names and if there is a match, the column value will be overwritten.
For instance, if df[‘CITY’] contains “СПБ”, a frequent acronym for Saint Petersburg, the value will be replaced, while for “Воронеж” it will remain unchanged.
And last but not least, we need to remove all duplicate rows from the table, add a column with a ranking position and return the first 10 rows. These would be the most liked breweries in a selected city.

df = pd.read_csv(f'data/cities/{en_city}.csv')     
df = df.loc[df['CHECK-INS'] >= checkins_n]
df.drop_duplicates(subset=['NAME', 'CITY'], keep='first', inplace=True)  
df.insert(0, 'RANKING', list('🏆 ' + str(i) if i in [1, 2, 3] else str(i) for i in range(1, len(df) + 1)))
selected_df = df.head(10)

After all DataFrame manipulations, the function returns a simply styled Bootstrap table of top breweries.

Bootstrap table layout in DBC

table = dbc.Table.from_dataframe(selected_df, striped=False,
                                bordered=False, hover=True,
                                size='sm',
                                style={'background-color': '#ffffff',
                                       'font-family': 'Proxima Nova Regular',
                                       'text-align':'center',
                                       'fontSize': '12px'},
                                className='table borderless'
                                )

return table

Layout structure

Add a Slider and a Dropdown menu with city names in application.py

To learn more about the Dashboard layout structure, please refer to our previous guide

checkins_slider_tab_1 = dbc.CardBody(
                           dbc.FormGroup(
                               [
                                   html.H6('Number of check-ins', style={'text-align': 'center'})),
                                   dcc.Slider(
                                       id='checkin_n_tab_1',
                                       min=0,
                                       max=250,
                                       step=25,
                                       value=250,  
                                       loading_state={'is_loading': True},
                                       marks={i: i for i in list(range(0, 251, 25))}
                                   ),
                               ],
                           ),
                           style={'max-height': '80px', 
                                  'padding-top': '25px'
                                  }
                       )

top_breweries = dbc.Card(
       [
           dbc.CardBody(
               [
                   dbc.FormGroup(
                       [
                           html.H6('Filter by city', style={'text-align': 'center'}),
                           dcc.Dropdown(
                               id='city_menu',
                               options=[{'label': i, 'value': i} for i in all_cities],
                               multi=False,
                               placeholder='Select city',
                               style={'font-family': 'Proxima Nova Regular'}
                           ),
                       ],
                   ),
                   html.P(id="tab-1-content", className="card-text"),
               ],
           ),
   ],
)

We’ll also need to add a callback function to update the table by dropdown menu and slider values:

@app.callback(
   Output("tab-1-content", "children"), [Input("city_menu", "value"),
                                         Input("checkin_n_tab_1", "value")]
)
def table_content(city, checkin_n):
   return get_top_russian_breweries_table(city, checkin_n)

Tada, the main table is ready! The dashboard can be used to receive up-to-date info about best Russian breweries, beers, and its rating across different regions, and help to make a better choice for an enjoyable tasting experience.

View the code on GitHub

No comments 626 2020 BI-tools bootstrap dash plotly python

How to build a dashboard with Bootstrap 4 from scratch (Part 1)

⏱ Время чтения текста – 13 минут

In previous articles we reviewed Plotly’s Dash Framework, learned to build scatter plots and create a map visualization. This time we will summarize our knowledge and put all the pieces together to design a dashboard layout using the Bootstrap 4 grid system.
To facilitate the development, we’ll refer to the dash-bootstrap-components library. This is a great tool that integrates Bootstrap in Dash, allowing us to write web pages in pure Python, and add any Bootstrap components and styling.

Draft Layout

Before we begin coding it’s crucial to have a plan of our app, a rough layout that would help us to see the big picture and quickly modify the structure. We used draw.io to make a dashboard draft, this application enables to create diagrams, graphs, flowcharts, and forms at the click of a button. The dashboard will be built according to this template:

Like the dashboard itself, the top header will be colored in gold and white, the main colors of Untappd. Just below the header, there is a section with breweries, which includes a scatter plot and a control panel. And at the bottom of the page, there will be a map showing beverage rating across the regions of Russia.

All right, let’s get started, first create a new python file with the name application.py. The file will store all the front end components of the dashboard, and create a new directory named assets. The directory structure should be similar:

- application.py
- assets/
    |-- typography.css
    |-- header.css
    |-- custom-script.js
    |-- image.png

Then we import the libraries and initialize our application:

import dash
import dash_bootstrap_components as dbc
import dash_html_components as html
import dash_core_components as dcc
import pandas as pd
from get_ratio_scatter_plot import get_plot
from get_russian_map import get_map
from clickhouse_driver import Client
from dash.dependencies import Input, Output

standard_BS = dbc.themes.BOOTSTRAP
app = dash.Dash(__name__, external_stylesheets=[standard_BS])

Main parameters of the app:
__name__ — to enable access to static elements stored in the assets folder (such as images, CSS and JS files)
external_stylesheets — external CSS styling, here we are using a standard Bootstrap theme, however you can create your own theme or use any of the availables ones.

Hook up a few more things to work with local files and connect to the Clickhouse Database:

app.scripts.config.serve_locally = True
app.css.config.serve_locally = True

client = Client(host='ec2-3-16-148-63.us-east-2.compute.amazonaws.com',
                user='default',
                password='',
                port='9000',
                database='default')

Add a palette of colors:

colors = ['#ffcc00', 
          '#f5f2e8', 
          '#f8f3e3',
          '#ffffff', 
          ]

Creating a layout

All the dashboard elements will be placed within a Bootstrap container, which is in the <div> block:

- app 
    |-- div
     |-- container
      |-- logo&header
     |-- container
      |-- div
       |-- controls&scatter
       |-- map

app.layout = html.Div(
                    [
                        dbc.Container(

                                         < header>
                         
                        dbc.Container(       
                            html.Div(
                                [
                        
                                    < body >
                        
                                ],
                            ),
                            fluid=False, style={'max-width': '1300px'},
                        ),
                    ],
                    style={'background-color': colors[1], 'font-family': 'Proxima Nova Bold'},
                )

Here we set a fixed container width, background color, and font style of the page that is stored in typography.css in the assets folder. Let’s take a closer look at the first element in the div block, that’s the top header with the Untappd logo:

logo = html.Img(src=app.get_asset_url('logo.png'),
                        style={'width': "128px", 'height': "128px",
                        }, className='inline-image')

and the header:

header = html.H3("Russian breweries stats from Untappd", style={'text-transform': "uppercase"})

We used Bootstrap Forms to position these two elements on the same level.

logo_and_header = dbc.FormGroup(
        [
            logo,
            html.Div(
                [
                    header
                ],
                className="p-5"
            )
        ],
        className='form-row',
)

The class name ‘p-5’ allows to increase padding and vertically align the title while specifying ‘form-row’ as the form class name we put the logo and header in one row. At this point, the top header should look the following:

Now we need to center the elements and add some colors. Create a separate container that will take one row. Specify ‘d-flex justify-content-center’ in the className to achieve the same output.

dbc.Container(
                    dbc.Row(
                        [
                            dbc.Col(
                                html.Div(
                                    logo_and_header,
                                ),
                            ),
                        ],
                        style={'max-height': '128px',
                               'color': 'white',
                       }

                    ),
                    className='d-flex justify-content-center',
                    style={'max-width': '100%',
                           'background-color': colors[0]},
                ),

And now the top header is done:

We’re approaching the main part, create the next Bootstrap Container and add a subheading:

dbc.Container(
                    html.Div(
                        [
                            html.Br(),
                            html.H5("Breweries", style={'text-align':'center', 'text-transform': 'uppercase'}),
                            html.Hr(), # horizontal  break

The main body will consist of Bootstrap Cards, they can provide a structured layout of all parts, giving each element a clear border and saving the white space. Create the next element, a control panel with sliders:

slider_day_values = [1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
slider_top_breweries_values = [5, 25, 50, 75, 100, 125, 150, 175, 200]

controls = dbc.Card(
    [
       dbc.CardBody(
           [
               dbc.FormGroup(
                    [
                        dbc.Label("Time Period", style={'text-align': 'center', 'font-size': '100%', 'text-transform': 'uppercase'}),
                        dcc.Slider(
                            id='slider-day',
                            min=1,
                            max=100,
                            step=10,
                            value=100,
                            marks={i: i for i in slider_day_values}
                        ),
                    ], style={'text-align': 'center'}
               ),
               dbc.FormGroup(
                    [
                        dbc.Label("Number of breweries", style={'text-align': 'center', 'font-size': '100%', 'text-transform': 'uppercase'}),
                        dcc.Slider(
                            id='slider-top-breweries',
                            min=5,
                            max=200,
                            step=5,
                            value=200,
                            marks={i: i for i in slider_top_breweries_values}
                        ),
                    ], style={'text-align': 'center'}
               ),
           ],
       )
    ],
    style={'height': '32.7rem', 'background-color': colors[3]}
)

The control panel consists of two sliders that can be used to change the view on the scatter, they are positioned one below the other in a Bootstrap Form. The sliders were put inside the dbc.CardBody block, other elements will be added in the same way. It allows to eliminate alignment problem and achieve clear borders. By default, the sliders are painted in blue, but we can easily customize them by changing the properties of the class in sliders.css. Add the control panel with the scatter plot as follows:

dbc.Row(
                [
                    dbc.Col(controls, width={"size": 4,
                                     "order": 'first',
                                             "offset": 0},
                     ),
                     dbc.Col(dbc.Card(
                                [
                                    dbc.CardBody(
                                        [
                                            html.H6("The ratio between the number of reviews and the average brewery rating",
                                                    className="card-title",
                                                    style={'text-transform': 'uppercase'}), 
                                            dcc.Graph(id='ratio-scatter-plot'),
                                        ],
                                    ),
                                ],
                                style={'background-color': colors[2], 'text-align':'center'}
                             ),
                     md=8),
                ],
                align="start",
                justify='center',
            ),
html.Br(),

And at the bottom of the page we will position the map:

html.H5("Venues and Regions", style={'text-align':'center', 'text-transform': 'uppercase',}),
                            html.Hr(), # horizontal  break
                            dbc.Row(
                                [
                                    dbc.Col(
                                        dbc.Card(
                                            [
                                                dbc.CardBody(
                                                    [
                                                        html.H6("Average beer rating across regions",
                                                                className="card-title",
                                                                style={'text-transform': 'uppercase'},
                                                        ),  
                                                        dcc.Graph(figure=get_map())
                                                    ],
                                                ),
                                            ],
                                        style={'background-color': colors[2], 'text-align': 'center'}
                                        ),
                                md=12),
                                ]
                            ),
                            html.Br(),

Callbacks in Dash

Callback functions allow making dashboard elements interactive through the Input and Output properties of a particular component.

@app.callback(
    Output('ratio-scatter-plot', 'figure'),
    [Input('slider-day', 'value'),
     Input('slider-top-breweries', 'value'),
     ]
)
def get_scatter_plots(n_days=100, top_n=200):
    if n_days == 100 and top_n == 200:
        df = pd.read_csv('data/ratio_scatter_plot.csv')
        return get_plot(n_days, top_n, df)
    else:
        return get_plot(n_days, top_n)

In this example, our inputs are the “value” properties of the components that have the ids “slider-day’” and “slider-top-breweries”. Our output is the “children” property of the component with the id “ratio-scatter-plot”. When the input values are changed, the decorator function will be called automatically and the output on the scatter is updated. Learn more about callbacks from the examples in the docs.
It’s worth noting, that the scatter plot may not be displayed correctly when the page is loaded. To avoid this scenario we need to specify its initial state and produce a scatter plot from the saved CSV file stored in the data folder. Then, when changing the slider values, all data will be taken directly from the Clickhouse tables.

Add a few more lines responsible for deployment and our app is ready to run:

application = app.server

if __name__ == '__main__':
    application.run(debug=True, port=8000)

Next, we need to deploy our app to AWS BeansTalk and the first part of our Bootstrap Dashboard is completed:

Thanks for reading the first part of our series about Bootstrap Dashboards, in the next one we are going to add more new components, improved callbacks, and talk about tables in Bootstrap.

View the code on Github

No comments 1325 2020 bootstrap dash data analytics untappd

Your password

LEFT JOIN: blog on analytics, visualisation & data science