<?xml version="1.0" encoding="utf-8"?> 
<rss version="2.0">

<channel>

<title>LEFT JOIN: blog on analytics, visualisation &amp; data science, posts tagged: dash</title>
<link>https://en.leftjoin.ru/tags/dash/</link>
<description></description>
<generator>E2 (v3386; Aegea)</generator>

<item>
<title>How to build a dashboard with Bootstrap 4 from scratch (Part 2)</title>
<guid isPermaLink="false">47</guid>
<link>https://en.leftjoin.ru/all/how-to-build-a-dashboard-with-bootstrap-4-from-scratch-part-2/</link>
<comments>https://en.leftjoin.ru/all/how-to-build-a-dashboard-with-bootstrap-4-from-scratch-part-2/</comments>
<description>
&lt;div class="e2-text-picture"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/1-18.png" width="2000" height="1154" alt="" /&gt;
&lt;/div&gt;
&lt;p&gt;Previously we shared &lt;a href="https://en.leftjoin.ru/all/how-to-build-dashboard-with-bootstrap-4-from-scratch-part-1/"&gt;how to use Bootstrap components in building dashboard layout&lt;/a&gt; and designed a simple yet flexible dashboard with a scatter plot and Russian map. In today’s material, we will continue adding more information, explore how to make Bootstrap tables responsive, and cover some complex callbacks for data acquisition.&lt;/p&gt;
&lt;h2&gt;Constructing Data Tables&lt;/h2&gt;
&lt;p&gt;All the code for populating our tables with data will be stored in &lt;span class="inline-code"&gt;get_tables.py&lt;/span&gt;, while the layout components are outlined in &lt;span class="inline-code"&gt;application.py&lt;/span&gt;. This article covers the process of creating the table with the top Russian breweries; you can find the code for the other three on GitHub.&lt;/p&gt;
&lt;p&gt;Data in the Top Breweries table can be filtered by city name in the dropdown menu, but the data collected from Untappd is not consistently structured: some city names are written in Latin script, others in Cyrillic. So the challenge is to unify the names for SQL queries, and this is where Google Translate comes to the rescue. We still have to manually create a dictionary of city names, though, since “Москва”, for example, can be written as “Moskva” rather than “Moscow”. This dictionary will be used later for mapping our DataFrame before transforming it into a Bootstrap table.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;import pandas as pd
import dash_bootstrap_components as dbc
from clickhouse_driver import Client
import numpy as np
from googletrans import Translator

translator = Translator()

client = Client(host='12.34.56.78', user='default', password='', port='9000', database='')

city_names = {
   'Moskva': 'Москва',
   'Moscow': 'Москва',
   'СПБ': 'Санкт-Петербург',
   'Saint Petersburg': 'Санкт-Петербург',
   'St Petersburg': 'Санкт-Петербург',
   'Nizhnij Novgorod': 'Нижний Новгород',
   'Tula': 'Тула',
   'Nizhniy Novgorod': 'Нижний Новгород',
}&lt;/code&gt;&lt;/pre&gt;&lt;h2&gt;Top Breweries Table&lt;/h2&gt;
&lt;p&gt;This table displays the top 10 Russian breweries and their position change according to the rating. Simply put, we need to compare data for two periods: [30 days ago; today] and [60 days ago; 30 days ago]. With this in mind, we will need the following headers: ranking, brewery name, position change, and number of check-ins.&lt;br /&gt;
Create the &lt;span class="inline-code"&gt;get_top_russian_breweries&lt;/span&gt; function that queries the ClickHouse database, sorts the data and returns a refined Pandas DataFrame. The following queries obtain data for the two periods, keep only breweries with at least &lt;span class="inline-code"&gt;checkins_n&lt;/span&gt; check-ins, and order the results by average rating.&lt;/p&gt;
&lt;p&gt;&lt;details&gt;&lt;br /&gt;
&lt;summary&gt;&lt;span style="color:#7ea9b8"&gt;Querying data from the Database&lt;/span&gt;&lt;/summary&gt;&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;def get_top_russian_breweries(checkins_n=250):
   top_n_brewery_today = client.execute(f'''
      SELECT  rt.brewery_id,
              rt.brewery_name,
              beer_pure_average_mult_count/count_for_that_brewery as avg_rating,
              count_for_that_brewery as checkins FROM (
      SELECT           
              brewery_id,
              dictGet('breweries', 'brewery_name', toUInt64(brewery_id)) as brewery_name,
              sum(rating_score) AS beer_pure_average_mult_count,
              count(rating_score) AS count_for_that_brewery
          FROM beer_reviews t1
          ANY LEFT JOIN venues AS t2 ON t1.venue_id = t2.venue_id
          WHERE isNotNull(venue_id) AND (created_at &amp;gt;= (today() - 30)) AND (venue_country = 'Россия') 
          GROUP BY           
              brewery_id,
              brewery_name) rt
      WHERE (checkins&amp;gt;={checkins_n})
      ORDER BY avg_rating DESC
      LIMIT 10
      '''
   )

   top_n_brewery_n_days = client.execute(f'''
  SELECT  rt.brewery_id,
          rt.brewery_name,
          beer_pure_average_mult_count/count_for_that_brewery as avg_rating,
          count_for_that_brewery as checkins FROM (
  SELECT           
          brewery_id,
          dictGet('breweries', 'brewery_name', toUInt64(brewery_id)) as brewery_name,
          sum(rating_score) AS beer_pure_average_mult_count,
          count(rating_score) AS count_for_that_brewery
      FROM beer_reviews t1
      ANY LEFT JOIN venues AS t2 ON t1.venue_id = t2.venue_id
      WHERE isNotNull(venue_id) AND (created_at &amp;gt;= (today() - 60) AND created_at &amp;lt;= (today() - 30)) AND (venue_country = 'Россия')
      GROUP BY           
          brewery_id,
          brewery_name) rt
  WHERE (checkins&amp;gt;={checkins_n})
  ORDER BY avg_rating DESC
  LIMIT 10
  '''
)&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;/details&gt;&lt;/p&gt;
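&lt;p&gt;In plain terms, each query averages the ratings per brewery, drops breweries with fewer than &lt;span class="inline-code"&gt;checkins_n&lt;/span&gt; check-ins, and keeps the ten best by average rating. The sketch below mirrors that logic in pure Python on made-up data; &lt;span class="inline-code"&gt;top_breweries&lt;/span&gt; and the sample check-ins are hypothetical and not part of &lt;span class="inline-code"&gt;get_tables.py&lt;/span&gt;.&lt;/p&gt;

```python
from collections import defaultdict


def top_breweries(checkins, min_checkins, limit=10):
    """Average rating per brewery, keeping breweries with enough check-ins."""
    totals = defaultdict(lambda: [0.0, 0])  # brewery -> [sum of ratings, count]
    for brewery, rating in checkins:
        totals[brewery][0] += rating
        totals[brewery][1] += 1
    rows = [(brewery, total / count, count)
            for brewery, (total, count) in totals.items()
            if count >= min_checkins]
    # Same ordering as ORDER BY avg_rating DESC in the queries.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


# Made-up (brewery, rating_score) pairs, for illustration only.
sample = [('A', 4.0), ('A', 5.0), ('B', 3.0), ('C', 4.0), ('C', 4.0)]
result = top_breweries(sample, min_checkins=2)
```

&lt;p&gt;Brewery B has a single check-in, so it is filtered out before the ranking is built, just as the outer WHERE clause does in SQL.&lt;/p&gt;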
&lt;p&gt;Creating two DataFrames with the received data:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;top_n = len(top_n_brewery_today)
column_names = ['brewery_id', 'brewery_name', 'avg_rating', 'checkins']

top_n_brewery_today_df = pd.DataFrame(top_n_brewery_today, columns=column_names).replace(np.nan, 0)
top_n_brewery_today_df['brewery_pure_average'] = round(top_n_brewery_today_df.avg_rating, 2)
top_n_brewery_today_df['brewery_rank'] = list(range(1, top_n + 1))

top_n_brewery_n_days = pd.DataFrame(top_n_brewery_n_days, columns=column_names).replace(np.nan, 0)
top_n_brewery_n_days['brewery_pure_average'] = round(top_n_brewery_n_days.avg_rating, 2)
top_n_brewery_n_days['brewery_rank'] = list(range(1, len(top_n_brewery_n_days) + 1))&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Then we calculate the position change over the period for each brewery. The try-except block handles the case where a brewery is missing from the earlier list, for example because it was not yet in our database 60 days ago.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;rank_was_list = []
for brewery_id in top_n_brewery_today_df.brewery_id:
   try:
       rank_was_list.append(
           top_n_brewery_n_days[top_n_brewery_n_days.brewery_id == brewery_id].brewery_rank.item())
   except ValueError:
       rank_was_list.append('–')
top_n_brewery_today_df['rank_was'] = rank_was_list&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Now we iterate over the columns with the current and former positions. If the former position is not a dash, we append an up or down arrow depending on the change.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;diff_rank_list = []
for rank_was, rank_now in zip(top_n_brewery_today_df['rank_was'], top_n_brewery_today_df['brewery_rank']):
   if rank_was != '–':
       difference = rank_was - rank_now
       if difference &amp;gt; 0:
           diff_rank_list.append(f'↑ +{difference}')
       elif difference &amp;lt; 0:
           diff_rank_list.append(f'↓ {difference}')
       else:
           diff_rank_list.append('–')
   else:
       diff_rank_list.append(rank_was)&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Finally, rename the DataFrame headers and insert a column with the current ranking positions, where the top 3 are marked with the trophy emoji.&lt;/p&gt;
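&lt;p&gt;The lookup and arrow logic from the two loops above, as well as the trophy labels, can also be expressed as small pure functions that are easy to test in isolation. This is a minimal sketch rather than the code from the repository: the helper names &lt;span class="inline-code"&gt;previous_ranks&lt;/span&gt;, &lt;span class="inline-code"&gt;format_rank_change&lt;/span&gt; and &lt;span class="inline-code"&gt;ranking_labels&lt;/span&gt; and the sample brewery ids are hypothetical, and a dictionary lookup stands in for the try-except block.&lt;/p&gt;

```python
def previous_ranks(prev_ids):
    """Map each brewery_id to its 1-based rank in the previous period."""
    return {brewery_id: rank for rank, brewery_id in enumerate(prev_ids, start=1)}


def format_rank_change(rank_was, rank_now):
    """Return an arrow label, or a dash when the rank is unchanged or unknown."""
    if rank_was is None:
        return '–'
    difference = rank_was - rank_now
    if difference > 0:
        return f'↑ +{difference}'
    if difference == 0:
        return '–'
    return f'↓ {difference}'


def ranking_labels(n, trophies=3):
    """Build labels for n rows, with a trophy in front of the first few."""
    return ['🏆 ' + str(i) if i in range(1, trophies + 1) else str(i)
            for i in range(1, n + 1)]


# Hypothetical brewery ids: the current top list and the one from the previous period.
today_ids = [11, 42, 7, 99]
prev_ids = [42, 11, 7]

was = previous_ranks(prev_ids)
changes = [format_rank_change(was.get(brewery_id), rank_now)
           for rank_now, brewery_id in enumerate(today_ids, start=1)]
labels = ranking_labels(len(today_ids))
```

&lt;p&gt;In &lt;span class="inline-code"&gt;get_top_russian_breweries&lt;/span&gt; itself, the final step works directly on the DataFrame:&lt;/p&gt;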
&lt;pre class="e2-text-code"&gt;&lt;code&gt;df = top_n_brewery_today_df[['brewery_name', 'avg_rating', 'checkins']].round(2)
df.insert(2, 'Position change', diff_rank_list)
df.columns = ['NAME', 'RATING', 'POSITION CHANGE', 'CHECK-INS']
df.insert(0, 'RANKING', list('🏆 ' + str(i) if i in [1, 2, 3] else str(i) for i in range(1, len(df) + 1)))

return df&lt;/code&gt;&lt;/pre&gt;&lt;h2&gt;Filtering data by city name&lt;/h2&gt;
&lt;p&gt;One of the main tasks we set before creating this dashboard was to find out which breweries are the most liked in a certain city. The user chooses a city in the dropdown menu and gets the results. Sounds pretty simple, but is it that easy?&lt;br /&gt;
Our next step is to write a script that updates the data for each city and stores it in separate CSV files. As we mentioned earlier, the city names are not consistently structured, so we use Google Translate within an if-else block, and since it may not convert some names to Cyrillic correctly, we need to specify such cases explicitly:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;en_city = venue_city
if en_city == 'Nizhnij Novgorod':
      ru_city = 'Нижний Новгород'
elif en_city == 'Perm':
      ru_city = 'Пермь'
elif en_city == 'Sergiev Posad':
      ru_city = 'Сергиев Посад'
elif en_city == 'Vladimir':
      ru_city = 'Владимир'
elif en_city == 'Yaroslavl':
      ru_city = 'Ярославль'
else:
      ru_city = translator.translate(en_city, dest='ru').text&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Then we add both the English and the Russian city name to the SQL query to receive all check-ins made in this city.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;WHERE (rt.venue_city='{ru_city}' OR rt.venue_city='{en_city}')&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Finally, we export the received data to a CSV file in the &lt;span class="inline-code"&gt;data/cities&lt;/span&gt; directory.&lt;/p&gt;
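&lt;p&gt;A side note on the WHERE clause above: interpolating city names into the query with an f-string breaks as soon as a name contains a quote. clickhouse-driver can substitute values itself when the query uses &lt;span class="inline-code"&gt;%(name)s&lt;/span&gt; placeholders and a dictionary is passed as the second argument to &lt;span class="inline-code"&gt;execute&lt;/span&gt;. Below is a minimal sketch under that assumption; &lt;span class="inline-code"&gt;city_filter&lt;/span&gt; is a hypothetical helper, and no database connection is made here.&lt;/p&gt;

```python
def city_filter(ru_city, en_city):
    """Return a WHERE fragment with placeholders and the matching parameters."""
    where = 'WHERE (rt.venue_city = %(ru_city)s OR rt.venue_city = %(en_city)s)'
    params = {'ru_city': ru_city, 'en_city': en_city}
    return where, params


where, params = city_filter('Пермь', 'Perm')

# Assumed usage with the clickhouse_driver Client created earlier:
# the query text ends with the fragment in `where`, and the values travel
# separately, for example client.execute(query, params).
```

&lt;p&gt;The export step for one city then looks like this:&lt;/p&gt;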
&lt;pre class="e2-text-code"&gt;&lt;code&gt;df = top_n_brewery_today_df[['brewery_name', 'venue_city', 'avg_rating', 'checkins']].round(2)
df.insert(3, 'Position Change', diff_rank_list)
df.columns = ['NAME', 'CITY', 'RATING', 'POSITION CHANGE', 'CHECK-INS']
# MAPPING
df['CITY'] = df['CITY'].map(lambda x: city_names[x] if (x in city_names) else x)
# TRANSLATING
df['CITY'] = df['CITY'].map(lambda x: translator.translate(x, dest='en').text)
df.to_csv(f'data/cities/{en_city}.csv', index=False)
print(f'{en_city}.csv updated!')&lt;/code&gt;&lt;/pre&gt;&lt;h2&gt;Scheduling Updates&lt;/h2&gt;
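&lt;p&gt;We will use the &lt;span class="inline-code"&gt;apscheduler&lt;/span&gt; library to run the script automatically and refresh the data for each city in &lt;span class="inline-code"&gt;all_cities&lt;/span&gt; once a day. With only &lt;span class="inline-code"&gt;hour=10&lt;/span&gt; set, the cron trigger fires at 10:00 in the scheduler's time zone (UTC in our case), and &lt;span class="inline-code"&gt;misfire_grace_time=30&lt;/span&gt; allows the job to start up to 30 seconds late.&lt;/p&gt;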
&lt;pre class="e2-text-code"&gt;&lt;code&gt;from apscheduler.schedulers.background import BackgroundScheduler
from get_tables import update_best_breweries

all_cities = sorted(['Vladimir', 'Voronezh', 'Ekaterinburg', 'Kazan', 'Red Pakhra', 'Krasnodar',
             'Kursk', 'Moscow', 'Nizhnij Novgorod', 'Perm', 'Rostov-on-Don', 'Saint Petersburg',
             'Sergiev Posad', 'Tula', 'Yaroslavl'])

scheduler = BackgroundScheduler()
@scheduler.scheduled_job('cron', hour=10, misfire_grace_time=30)
def update_data():
   for city in all_cities:
       update_best_breweries(city)
scheduler.start()&lt;/code&gt;&lt;/pre&gt;&lt;h2&gt;Table from DataFrame&lt;/h2&gt;
&lt;p&gt;&lt;span class="inline-code"&gt;get_top_russian_breweries_table(venue_city, checkins_n=250)&lt;/span&gt; accepts &lt;span class="inline-code"&gt;venue_city&lt;/span&gt; and &lt;span class="inline-code"&gt;checkins_n&lt;/span&gt; and generates a Bootstrap table with the top breweries. The value of the second parameter, &lt;span class="inline-code"&gt;checkins_n&lt;/span&gt;, can be changed with the slider. If the city name is not specified, the function returns the table of top Russian breweries.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;if venue_city is None:
      selected_df = get_top_russian_breweries(checkins_n)
else: 
      en_city = venue_city&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Otherwise, the DataFrame is constructed from a CSV file stored in &lt;span class="inline-code"&gt;data/cities/&lt;/span&gt;. Since the city column may still contain different spellings, we apply a mapping using a lambda expression with the &lt;span class="inline-code"&gt;map()&lt;/span&gt; method. The lambda function compares values in the column against the keys of &lt;span class="inline-code"&gt;city_names&lt;/span&gt; and, if there is a match, overwrites the column value.&lt;br /&gt;
For instance, if &lt;span class="inline-code"&gt;df['CITY']&lt;/span&gt; contains “СПБ”, a frequent abbreviation for Saint Petersburg, the value will be replaced, while “Воронеж” will remain unchanged.&lt;br /&gt;
Last but not least, we remove all duplicate rows from the table, add a column with the ranking position and return the first 10 rows. These are the most liked breweries in the selected city.&lt;/p&gt;
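&lt;p&gt;The mapping step described above boils down to a dictionary lookup with a fallback to the original value. Here is a minimal sketch without pandas, using a shortened copy of &lt;span class="inline-code"&gt;city_names&lt;/span&gt;; &lt;span class="inline-code"&gt;normalize_city&lt;/span&gt; is a hypothetical helper name.&lt;/p&gt;

```python
# Shortened copy of the city_names dictionary from get_tables.py.
city_names = {
    'Moskva': 'Москва',
    'Moscow': 'Москва',
    'СПБ': 'Санкт-Петербург',
}


def normalize_city(name, mapping=city_names):
    """Return the canonical name if the key is known, else the name unchanged."""
    return mapping.get(name, name)


cities = ['СПБ', 'Воронеж', 'Moskva']
normalized = [normalize_city(city) for city in cities]

# With pandas, the same lookup could be applied as df['CITY'].map(normalize_city).
```

&lt;p&gt;The rest of the function reads the CSV file for the selected city and applies the remaining steps:&lt;/p&gt;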
&lt;pre class="e2-text-code"&gt;&lt;code&gt;df = pd.read_csv(f'data/cities/{en_city}.csv')     
df = df.loc[df['CHECK-INS'] &amp;gt;= checkins_n]
df = df.drop_duplicates(subset=['NAME', 'CITY'], keep='first')
df.insert(0, 'RANKING', list('🏆 ' + str(i) if i in [1, 2, 3] else str(i) for i in range(1, len(df) + 1)))
selected_df = df.head(10)&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;After all DataFrame manipulations, the function returns a simply styled Bootstrap table of top breweries.&lt;/p&gt;
&lt;p&gt;&lt;details&gt;&lt;br /&gt;
&lt;summary&gt;&lt;span style="color:#7ea9b8"&gt;Bootstrap table layout in DBC&lt;/span&gt;&lt;/summary&gt;&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;table = dbc.Table.from_dataframe(selected_df, striped=False,
                                bordered=False, hover=True,
                                size='sm',
                                style={'background-color': '#ffffff',
                                       'font-family': 'Proxima Nova Regular',
                                       'text-align':'center',
                                       'fontSize': '12px'},
                                className='table borderless'
                                )

return table&lt;/code&gt;&lt;/pre&gt;&lt;h2&gt;Layout structure&lt;/h2&gt;
&lt;p&gt;Add a slider and a dropdown menu with city names in &lt;span class="inline-code"&gt;application.py&lt;/span&gt;.&lt;/p&gt;
&lt;p class="note"&gt;To learn more about the dashboard layout structure, please refer to &lt;a href="https://en.leftjoin.ru/all/how-to-build-dashboard-with-bootstrap-4-from-scratch-part-1/"&gt;our previous guide&lt;/a&gt;.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;checkins_slider_tab_1 = dbc.CardBody(
                           dbc.FormGroup(
                               [
                                   html.H6('Number of check-ins', style={'text-align': 'center'}),
                                   dcc.Slider(
                                       id='checkin_n_tab_1',
                                       min=0,
                                       max=250,
                                       step=25,
                                       value=250,  
                                       loading_state={'is_loading': True},
                                       marks={i: i for i in list(range(0, 251, 25))}
                                   ),
                               ],
                           ),
                           style={'max-height': '80px', 
                                  'padding-top': '25px'
                                  }
                       )

top_breweries = dbc.Card(
       [
           dbc.CardBody(
               [
                   dbc.FormGroup(
                       [
                           html.H6('Filter by city', style={'text-align': 'center'}),
                           dcc.Dropdown(
                               id='city_menu',
                               options=[{'label': i, 'value': i} for i in all_cities],
                               multi=False,
                               placeholder='Select city',
                               style={'font-family': 'Proxima Nova Regular'}
                           ),
                       ],
                   ),
                   html.P(id=&amp;quot;tab-1-content&amp;quot;, className=&amp;quot;card-text&amp;quot;),
               ],
           ),
   ],
)&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;&lt;/details&gt;&lt;/p&gt;
&lt;p&gt;We’ll also need to add a callback function to update the table by dropdown menu and slider values:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;@app.callback(
   Output(&amp;quot;tab-1-content&amp;quot;, &amp;quot;children&amp;quot;), [Input(&amp;quot;city_menu&amp;quot;, &amp;quot;value&amp;quot;),
                                         Input(&amp;quot;checkin_n_tab_1&amp;quot;, &amp;quot;value&amp;quot;)]
)
def table_content(city, checkin_n):
   return get_top_russian_breweries_table(city, checkin_n)&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Tada, the main table is ready! The dashboard can be used to get up-to-date information about the best Russian breweries and beers and their ratings across different regions, and to help make a better choice for an enjoyable tasting experience.&lt;/p&gt;
&lt;div class="e2-text-picture"&gt;
&lt;a href="http://dashboard-final-en.us-east-2.elasticbeanstalk.com/" class="e2-text-picture-link"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/2-17.png" width="1234" height="630" alt="" /&gt;
&lt;/a&gt;&lt;/div&gt;
&lt;p&gt;&lt;i&gt;View the code on &lt;a href="https://github.com/valiotti/leftjoin/tree/master/rutappd"&gt;GitHub&lt;/a&gt;&lt;/i&gt;&lt;/p&gt;
</description>
<pubDate>Wed, 07 Oct 2020 16:35:17 +0300</pubDate>
</item>

<item>
<title>How to build a dashboard with Bootstrap 4 from scratch (Part 1)</title>
<guid isPermaLink="false">43</guid>
<link>https://en.leftjoin.ru/all/how-to-build-dashboard-with-bootstrap-4-from-scratch-part-1/</link>
<comments>https://en.leftjoin.ru/all/how-to-build-dashboard-with-bootstrap-4-from-scratch-part-1/</comments>
<description>
&lt;p&gt;In previous articles we reviewed Plotly’s Dash framework, &lt;a href="https://en.leftjoin.ru/all/building-a-scatter-plot-for-untappd-breweries/"&gt;learned to build scatter plots&lt;/a&gt; and &lt;a href="https://en.leftjoin.ru/all/visualizing-covid-19-in-russia-with-plotly/"&gt;create a map visualization&lt;/a&gt;. This time we will summarize our knowledge and put all the pieces together to design a dashboard layout using the Bootstrap 4 grid system.&lt;br /&gt;
To facilitate the development, we’ll refer to the &lt;a href="https://dash-bootstrap-components.opensource.faculty.ai/"&gt;dash-bootstrap-components&lt;/a&gt; library. This is a great tool that integrates Bootstrap in Dash, allowing us to write web pages in pure Python, and add any Bootstrap components and styling.&lt;/p&gt;
&lt;h2&gt;Draft Layout&lt;/h2&gt;
&lt;p&gt;Before we begin coding, it’s crucial to have a plan for our app: a rough layout that helps us see the big picture and quickly modify the structure. We used &lt;a href="https://app.diagrams.net/"&gt;draw.io&lt;/a&gt; to make a dashboard draft; this application makes it easy to create diagrams, graphs, flowcharts, and forms. The dashboard will be built according to this template:&lt;/p&gt;
&lt;div class="e2-text-picture"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/template_1@2x.png" width="872" height="946" alt="" /&gt;
&lt;/div&gt;
&lt;p&gt;Like the dashboard itself,  the top header will be colored in gold and white, the main colors of &lt;a href="https://untappd.com/"&gt;Untappd&lt;/a&gt;.  Just below the header, there is a section with breweries, which includes a scatter plot and a control panel.  And at the bottom of the page, there will be a map showing beverage rating across the regions of Russia.&lt;/p&gt;
&lt;p&gt;All right, let’s get started. First, create a new Python file named &lt;span class="inline-code"&gt;application.py&lt;/span&gt;, which will store all the front-end components of the dashboard, and a new directory named &lt;span class="inline-code"&gt;assets&lt;/span&gt;. The directory structure should look similar to this:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;- application.py
- assets/
    |-- typography.css
    |-- header.css
    |-- custom-script.js
    |-- image.png&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Then we import the libraries and initialize our application:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;import dash
import dash_bootstrap_components as dbc
import dash_html_components as html
import dash_core_components as dcc
import pandas as pd
from get_ratio_scatter_plot import get_plot
from get_russian_map import get_map
from clickhouse_driver import Client
from dash.dependencies import Input, Output

standard_BS = dbc.themes.BOOTSTRAP
app = dash.Dash(__name__, external_stylesheets=[standard_BS])&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Main parameters of the app:&lt;br /&gt;
&lt;span class="inline-code"&gt;__name__&lt;/span&gt;: enables access to static files stored in the assets folder (such as images, CSS and JS files)&lt;br /&gt;
&lt;span class="inline-code"&gt;external_stylesheets&lt;/span&gt;: external CSS styling; here we are using the standard Bootstrap theme, but you can create your own theme or use any of &lt;a href="https://www.bootstrapcdn.com/bootswatch/"&gt;the available ones&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Hook up a few more things to work with local files and connect to the ClickHouse database:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;app.scripts.config.serve_locally = True
app.css.config.serve_locally = True

client = Client(host='ec2-3-16-148-63.us-east-2.compute.amazonaws.com',
                user='default',
                password='',
                port='9000',
                database='default')&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Add a palette of colors:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;colors = ['#ffcc00', 
          '#f5f2e8', 
          '#f8f3e3',
          '#ffffff', 
          ]&lt;/code&gt;&lt;/pre&gt;&lt;h2&gt;Creating a layout&lt;/h2&gt;
&lt;p&gt;All the dashboard elements will be placed within a Bootstrap container, which sits inside a &amp;lt;div&amp;gt; block:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;- app 
    |-- div
     |-- container
      |-- logo&amp;amp;header
     |-- container
      |-- div
       |-- controls&amp;amp;scatter
       |-- map&lt;/code&gt;&lt;/pre&gt;&lt;pre class="e2-text-code"&gt;&lt;code&gt;app.layout = html.Div(
                    [
                        dbc.Container(

                                         &amp;lt; header&amp;gt;
                         
                        dbc.Container(       
                            html.Div(
                                [
                        
                                    &amp;lt; body &amp;gt;
                        
                                ],
                            ),
                            fluid=False, style={'max-width': '1300px'},
                        ),
                    ],
                    style={'background-color': colors[1], 'font-family': 'Proxima Nova Bold'},
                )&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Here we set a fixed container width, the background color, and the page font, which is defined in typography.css in the assets folder. Let’s take a closer look at the first element in the div block, the top header with the Untappd logo:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;logo = html.Img(src=app.get_asset_url('logo.png'),
                        style={'width': &amp;quot;128px&amp;quot;, 'height': &amp;quot;128px&amp;quot;,
                        }, className='inline-image')&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;and the header:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;header = html.H3(&amp;quot;Russian breweries stats from Untappd&amp;quot;, style={'text-transform': &amp;quot;uppercase&amp;quot;})&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;We used Bootstrap Forms to position these two elements on the same level.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;logo_and_header = dbc.FormGroup(
        [
            logo,
            html.Div(
                [
                    header
                ],
                className=&amp;quot;p-5&amp;quot;
            )
        ],
        className='form-row',
)&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The class name ‘p-5’ increases the padding and vertically aligns the title, while ‘form-row’ as the form class name puts the logo and header in one row. At this point, the top header should look like this:&lt;/p&gt;
&lt;div class="e2-text-picture"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/logo_and_header.png" width="2132" height="242" alt="" /&gt;
&lt;/div&gt;
&lt;p&gt;Now we need to center the elements and add some colors. Create a separate container that takes up one row, and specify &lt;span class="inline-code"&gt;'d-flex justify-content-center'&lt;/span&gt; in its &lt;span class="inline-code"&gt;className&lt;/span&gt; to center the content horizontally.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;dbc.Container(
                    dbc.Row(
                        [
                            dbc.Col(
                                html.Div(
                                    logo_and_header,
                                ),
                            ),
                        ],
                        style={'max-height': '128px',
                               'color': 'white',
                       }

                    ),
                    className='d-flex justify-content-center',
                    style={'max-width': '100%',
                           'background-color': colors[0]},
                ),&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;And now the top header is done:&lt;/p&gt;
&lt;div class="e2-text-picture"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/top-header.png" width="2200" height="245" alt="" /&gt;
&lt;/div&gt;
&lt;p&gt;Now we’re approaching the main part. Create the next Bootstrap container and add a subheading:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;dbc.Container(
                    html.Div(
                        [
                            html.Br(),
                            html.H5(&amp;quot;Breweries&amp;quot;, style={'text-align':'center', 'text-transform': 'uppercase'}),
                            html.Hr(), # horizontal break&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The main body will consist of Bootstrap Cards. They provide a structured layout, giving each element a clear border and preserving white space. Create the next element, a control panel with sliders:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;slider_day_values = [1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
slider_top_breweries_values = [5, 25, 50, 75, 100, 125, 150, 175, 200]

controls = dbc.Card(
    [
       dbc.CardBody(
           [
               dbc.FormGroup(
                    [
                        dbc.Label(&amp;quot;Time Period&amp;quot;, style={'text-align': 'center', 'font-size': '100%', 'text-transform': 'uppercase'}),
                        dcc.Slider(
                            id='slider-day',
                            min=1,
                            max=100,
                            step=10,
                            value=100,
                            marks={i: i for i in slider_day_values}
                        ),
                    ], style={'text-align': 'center'}
               ),
               dbc.FormGroup(
                    [
                        dbc.Label(&amp;quot;Number of breweries&amp;quot;, style={'text-align': 'center', 'font-size': '100%', 'text-transform': 'uppercase'}),
                        dcc.Slider(
                            id='slider-top-breweries',
                            min=5,
                            max=200,
                            step=5,
                            value=200,
                            marks={i: i for i in slider_top_breweries_values}
                        ),
                    ], style={'text-align': 'center'}
               ),
           ],
       )
    ],
    style={'height': '32.7rem', 'background-color': colors[3]}
)&lt;/code&gt;&lt;/pre&gt;&lt;div class="e2-text-picture"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/2@2x.png" width="291.5" height="149" alt="" /&gt;
&lt;/div&gt;
&lt;p&gt;The control panel consists of two sliders that change the view of the scatter plot; they are positioned one below the other in a Bootstrap Form. The sliders are placed inside a dbc.CardBody block, and other elements will be added in the same way. This eliminates alignment problems and gives clear borders. By default, the sliders are blue, but we can easily customize them by changing the properties of the class in sliders.css. Add the control panel with the scatter plot as follows:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;dbc.Row(
                [
                    dbc.Col(controls, width={&amp;quot;size&amp;quot;: 4,
                                     &amp;quot;order&amp;quot;: 'first',
                                             &amp;quot;offset&amp;quot;: 0},
                     ),
                     dbc.Col(dbc.Card(
                                [
                                    dbc.CardBody(
                                        [
                                            html.H6(&amp;quot;The ratio between the number of reviews and the average brewery rating&amp;quot;,
                                                    className=&amp;quot;card-title&amp;quot;,
                                                    style={'text-transform': 'uppercase'}), 
                                            dcc.Graph(id='ratio-scatter-plot'),
                                        ],
                                    ),
                                ],
                                style={'background-color': colors[2], 'text-align':'center'}
                             ),
                     md=8),
                ],
                align=&amp;quot;start&amp;quot;,
                justify='center',
            ),
html.Br(),&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;And at the bottom of the page we will position the map:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;html.H5(&amp;quot;Venues and Regions&amp;quot;, style={'text-align':'center', 'text-transform': 'uppercase',}),
                            html.Hr(), # horizontal  break
                            dbc.Row(
                                [
                                    dbc.Col(
                                        dbc.Card(
                                            [
                                                dbc.CardBody(
                                                    [
                                                        html.H6(&amp;quot;Average beer rating across regions&amp;quot;,
                                                                className=&amp;quot;card-title&amp;quot;,
                                                                style={'text-transform': 'uppercase'},
                                                        ),  
                                                        dcc.Graph(figure=get_map())
                                                    ],
                                                ),
                                            ],
                                        style={'background-color': colors[2], 'text-align': 'center'}
                                        ),
                                md=12),
                                ]
                            ),
                            html.Br(),&lt;/code&gt;&lt;/pre&gt;&lt;h2&gt;Callbacks in Dash&lt;/h2&gt;
&lt;p&gt;Callback functions make dashboard elements interactive by linking the Input and Output properties of particular components.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;@app.callback(
    Output('ratio-scatter-plot', 'figure'),
    [Input('slider-day', 'value'),
     Input('slider-top-breweries', 'value'),
     ]
)
def get_scatter_plots(n_days=100, top_n=200):
    if n_days == 100 and top_n == 200:
        df = pd.read_csv('data/ratio_scatter_plot.csv')
        return get_plot(n_days, top_n, df)
    else:
        return get_plot(n_days, top_n)&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;In this example, our inputs are the “value” properties of the components with the ids “slider-day” and “slider-top-breweries”. Our output is the “figure” property of the component with the id “ratio-scatter-plot”. When the input values change, the decorated function is called automatically and the scatter plot is updated. Learn more about callbacks from &lt;a href="https://dash.plotly.com/basic-callbacks/"&gt;the examples in the docs&lt;/a&gt;.&lt;br /&gt;
Note that the scatter plot may not be displayed correctly when the page is first loaded. To avoid this, we specify its initial state and build the scatter plot from a saved CSV file stored in the data folder. Then, when the slider values change, all data is taken directly from the ClickHouse tables.&lt;/p&gt;
&lt;div class="e2-text-picture"&gt;
&lt;div class="fotorama" data-width="833" data-ratio="1.595785440613"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/scatter_empty_2@2x.png" width="833" height="522" alt="" /&gt;
&lt;img src="https://en.leftjoin.ru/pictures/scatter_2@2x.png" width="828" height="515" alt="" /&gt;
&lt;/div&gt;
&lt;/div&gt;
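&lt;p&gt;The choice between the saved file and a live query can be sketched without Dash at all. The snippet below is only an illustration under stated assumptions: the loader functions, the inline CSV text, and its values are hypothetical stand-ins for the saved CSV file and the ClickHouse query, and the defaults are taken from the callback signature above.&lt;/p&gt;

```python
import csv
import io

# Slider defaults assumed from the callback signature above.
DEFAULT_N_DAYS = 100
DEFAULT_TOP_N = 200


def pick_rows(n_days, top_n, load_cached, load_live):
    """Return cached rows for the default slider state, live rows otherwise.

    load_cached and load_live are hypothetical callables standing in for
    reading the saved CSV and querying ClickHouse, respectively.
    """
    if (n_days, top_n) == (DEFAULT_N_DAYS, DEFAULT_TOP_N):
        return load_cached()
    return load_live(n_days, top_n)


# Stand-in for data/ratio_scatter_plot.csv (made-up values).
CACHED_CSV = "brewery_id,rating_count\n1,250\n2,400\n"


def load_cached():
    return list(csv.DictReader(io.StringIO(CACHED_CSV)))


def load_live(n_days, top_n):
    # A real implementation would run the database query here.
    return [{"source": "live", "n_days": n_days, "top_n": top_n}]


initial = pick_rows(100, 200, load_cached, load_live)  # page load
moved = pick_rows(30, 200, load_cached, load_live)     # slider moved
print(len(initial), moved[0]["source"])
```

&lt;p&gt;Keeping the decision in a small function like this makes it easy to check that the first page load never touches the database.&lt;/p&gt;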
&lt;p&gt;Add a few more lines responsible for deployment and our app is ready to run:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;application = app.server

if __name__ == '__main__':
    application.run(debug=True, port=8000)&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Next, we need to &lt;a href="https://en.leftjoin.ru/all/deploying-analytical-web-app-with-aws-elastic-beanstalk/"&gt;deploy our app to AWS Elastic Beanstalk&lt;/a&gt;, and &lt;a href="http://unappd-part-1-en.us-east-2.elasticbeanstalk.com/"&gt;the first part of our Bootstrap Dashboard is complete&lt;/a&gt;:&lt;/p&gt;
&lt;div class="embed-responsive embed-responsive-4by3" style="min-width:500"&gt;&lt;p&gt;&lt;iframe id="igraph" scrolling="yes" style="border:none;"seamless="seamless" src='http://unappd-part-1-en.us-east-2.elasticbeanstalk.com/' height="1360px" width="1100px"&lt;/p&gt;
&lt;/iframe&gt;&lt;/div&gt;&lt;p&gt;Thanks for reading the first part of our series about Bootstrap dashboards. In the next one, we are going to add new components, improve the callbacks, and talk about tables in Bootstrap.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://github.com/valiotti/leftjoin/tree/master/untappd_dashboard_en%20(part_1)/"&gt; View the code on Github&lt;/a&gt;&lt;/p&gt;
</description>
<pubDate>Wed, 16 Sep 2020 16:21:28 +0300</pubDate>
</item>

<item>
<title>Visualizing COVID-19 in Russia with Plotly</title>
<guid isPermaLink="false">41</guid>
<link>https://en.leftjoin.ru/all/visualizing-covid-19-in-russia-with-plotly/</link>
<comments>https://en.leftjoin.ru/all/visualizing-covid-19-in-russia-with-plotly/</comments>
<description>
&lt;p&gt;Maps are widely used in data visualization. They are a great tool for displaying statistics for particular areas, regions, and cities. Before displaying a map, we need to encode each region or other administrative unit. A choropleth map is divided into polygons and multipolygons defined by latitude and longitude coordinates. Plotly has a built-in solution for plotting choropleth maps of American and European regions, but Russia is not included yet. So we decided to use an existing GeoJSON file to map the administrative regions of Russia and display the latest COVID-19 stats with Plotly.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;from urllib.request import urlopen
import json
import requests
import numpy as np
import pandas as pd
from selenium import webdriver
from bs4 import BeautifulSoup as bs
import plotly.graph_objects as go&lt;/code&gt;&lt;/pre&gt;&lt;h2&gt;Modifying GeoJSON&lt;/h2&gt;
&lt;p&gt;First, we need to download a public GeoJSON file with the boundaries of the federal subjects of Russia. The file already contains some information, such as region names, but it still doesn’t fit the required format and is missing region identifiers.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;with urlopen('https://raw.githubusercontent.com/codeforamerica/click_that_hood/master/public/data/russia.geojson') as response:
    counties = json.load(response)&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Besides that, there are slight differences in naming. For example, on стопкоронавирус.рф, the site we are going to scrape data from, Bashkortostan is listed as “The Republic of Bashkortostan”, while in our GeoJSON file it is simply named “Bashkortostan”. These differences should be eliminated so that the two sources can be matched later. Each word in the names should also start with a capital letter.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;regions_republic_1 = ['Бурятия', 'Тыва', 'Адыгея', 'Татарстан', 'Марий Эл',
                      'Чувашия', 'Северная Осетия – Алания', 'Алтай',
                      'Дагестан', 'Ингушетия', 'Башкортостан']
regions_republic_2 = ['Удмуртская республика', 'Кабардино-Балкарская республика',
                      'Карачаево-Черкесская республика', 'Чеченская республика']
for k in range(len(counties['features'])):
    counties['features'][k]['id'] = k
    if counties['features'][k]['properties']['name'] in regions_republic_1:
        counties['features'][k]['properties']['name'] = 'Республика ' + counties['features'][k]['properties']['name']
    elif counties['features'][k]['properties']['name'] == 'Ханты-Мансийский автономный округ - Югра':
        counties['features'][k]['properties']['name'] = 'Ханты-Мансийский АО'
    elif counties['features'][k]['properties']['name'] in regions_republic_2:
        counties['features'][k]['properties']['name'] = counties['features'][k]['properties']['name'].title()&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;It’s time to create a DataFrame from the resulting GeoJSON data with the regions of Russia. We’ll take the identifiers and the names.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;region_id_list = []
regions_list = []
for k in range(len(counties['features'])):
    region_id_list.append(counties['features'][k]['id'])
    regions_list.append(counties['features'][k]['properties']['name'])
df_regions = pd.DataFrame()
df_regions['region_id'] = region_id_list
df_regions['region_name'] = regions_list&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;As a result, our DataFrame looks like the following:&lt;/p&gt;
&lt;div class="e2-text-picture"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/1-13.png" width="313" height="179" alt="" /&gt;
&lt;/div&gt;
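&lt;p&gt;The same idea, assigning an id to each feature and normalizing its name, can be tried on a tiny in-memory GeoJSON before running it against the real file. This is a minimal sketch: the three features (with geometry omitted) and the two shortened name sets are hypothetical stand-ins for the downloaded file and the full lists above.&lt;/p&gt;

```python
# A tiny stand-in for the downloaded GeoJSON (geometry omitted for brevity).
geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"name": "Башкортостан"}},
        {"type": "Feature", "properties": {"name": "Чеченская республика"}},
        {"type": "Feature", "properties": {"name": "Москва"}},
    ],
}

# Hypothetical, shortened versions of the two name lists used above.
needs_prefix = {"Башкортостан"}
needs_title_case = {"Чеченская республика"}

for region_id, feature in enumerate(geojson["features"]):
    feature["id"] = region_id  # the id later passed to Plotly as a location
    name = feature["properties"]["name"]
    if name in needs_prefix:
        name = "Республика " + name
    elif name in needs_title_case:
        name = name.title()
    feature["properties"]["name"] = name

# (id, name) pairs, ready to be turned into a DataFrame.
rows = [(f["id"], f["properties"]["name"]) for f in geojson["features"]]
print(rows)
```

&lt;p&gt;Using enumerate keeps the id and the feature in step without indexing into the list by hand.&lt;/p&gt;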
&lt;h2&gt;Data Scraping&lt;/h2&gt;
&lt;p&gt;We need to scrape the data stored in this table:&lt;/p&gt;
&lt;div class="e2-text-picture"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/2-13.png" width="1160" height="280" alt="" /&gt;
&lt;/div&gt;
&lt;p&gt;Let’s use the Selenium library for this task. We need to navigate to the webpage and convert its source into a BeautifulSoup object:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;driver = webdriver.Chrome()
driver.get('https://стопкоронавирус.рф/information/')
source_data = driver.page_source
soup = bs(source_data, 'lxml')&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The region names are wrapped in &amp;lt;th&amp;gt; tags, while the latest data is stored in table cells, each defined with a &amp;lt;td&amp;gt; tag.&lt;/p&gt;
&lt;div class="e2-text-picture"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/3-12.png" width="458" height="120" alt="" /&gt;
&lt;/div&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;divs_data = soup.find_all('td')&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The divs_data  list should return something like this:&lt;/p&gt;
&lt;div class="e2-text-picture"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/4-10.png" width="816" height="312" alt="" /&gt;
&lt;/div&gt;
&lt;p&gt;The data comes as one flat list that includes both new cases and active ones. Each region corresponds to five consecutive values: the first five belong to Moscow, the next five to Moscow Region, and so on. We can use this pattern to create five lists and populate them according to a counter. The first value will be appended to the list with active cases, the second value to the list of new ones, etc. After every five values, the counter starts over.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;count = 1
for td in divs_data:
    if count == 1:
        sick_list.append(int(td.text))
    elif count == 2:
        new_list.append(int(td.text))
    elif count == 3:
        cases_list.append(int(td.text))
    elif count == 4:
        healed_list.append(int(td.text))
    elif count == 5:
        died_list.append(int(td.text))
        count = 0
    count += 1&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The next step is to extract the region names from the table. They are stored in the header cells with the col-region class. We also need to clean up the data by removing extra white space and line breaks.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;divs_region_names = soup.find_all('th', {'class':'col-region'})
region_names_list = []
for i in range(1, len(divs_region_names)):
    region_name = divs_region_names[i].text
    region_name = region_name.replace('\n', '').replace('  ', '')
    region_names_list.append(region_name)&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Create a DataFrame:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;df = pd.DataFrame()
df['region_name'] = region_names_list
df['sick'] = sick_list
df['new'] = new_list
df['cases'] = cases_list
df['healed'] = healed_list
df['died'] = died_list&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;After reviewing our data once again, we found a trailing white space in the region name at index 10. This should be fixed right away, otherwise the name will not match when we merge the tables.&lt;/p&gt;
&lt;div class="e2-text-picture"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/5-8.png" width="310" height="128" alt="" /&gt;
&lt;/div&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;df.loc[10, 'region_name'] = df[df.region_name == 'Челябинская область '].region_name.item().strip(' ')&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Finally,  we can merge our DataFrame on the region_name column, so that the resulted table will include a column with region id, which is required to make a choropleth map.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;df = df.merge(df_regions, on='region_name')&lt;/code&gt;&lt;/pre&gt;&lt;h2&gt;Creating a choropleth map with Plotly&lt;/h2&gt;
&lt;p&gt;Let’s create a new figure and pass a Choroplethmapbox object to it. The geojson parameter accepts the counties variable holding the GeoJSON data, and locations takes the region_id column. The z parameter holds the data to be color-coded; in this example we pass the number of new cases for each region. The text parameter takes the region names. The colorscale parameter accepts a list of pairs, each consisting of a position between 0 and 1 and an RGB color code. Here, the palette changes from green to yellow and then red, depending on the number of new cases. By passing additional values in customdata, we can show them in our hovertemplate.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;fig = go.Figure(go.Choroplethmapbox(geojson=counties,
                           locations=df['region_id'],
                           z=df['new'],
                           text=df['region_name'],
                           colorscale=[[0, 'rgb(34, 150, 79)'],
                                       [0.2, 'rgb(249, 247, 174)'],
                                       [0.8, 'rgb(253, 172, 99)'],
                                       [1, 'rgb(212, 50, 44)']],
                           colorbar_thickness=20,
                           customdata=np.stack([df['cases'], df['died'], df['sick'], df['healed']], axis=-1),
                           hovertemplate='&amp;lt;b&amp;gt;%{text}&amp;lt;/b&amp;gt;'+ '&amp;lt;br&amp;gt;' +
                                         'New cases: %{z}' + '&amp;lt;br&amp;gt;' +
                                         'Active cases: %{customdata[0]}' + '&amp;lt;br&amp;gt;' +
                                         'Deaths: %{customdata[1]}' + '&amp;lt;br&amp;gt;' +
                                         'Total cases: %{customdata[2]}' + '&amp;lt;br&amp;gt;' +
                                         'Recovered: %{customdata[3]}' +
                                         '&amp;lt;extra&amp;gt;&amp;lt;/extra&amp;gt;',
                           hoverinfo='text+z'))&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Let’s customize the map using a ready-made neutral style called carto-positron. Set the following parameters and display the map:&lt;br /&gt;
mapbox_zoom: sets the initial zoom level;&lt;br /&gt;
mapbox_center: sets the coordinates the map is centered on;&lt;br /&gt;
marker_line_width: border width (we removed the borders by setting this parameter to 0);&lt;br /&gt;
margin: usually set to 0 on all sides to make the map wider.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;fig.update_layout(mapbox_style=&amp;quot;carto-positron&amp;quot;,
                  mapbox_zoom=1, mapbox_center = {&amp;quot;lat&amp;quot;: 66, &amp;quot;lon&amp;quot;: 94})
fig.update_traces(marker_line_width=0)
fig.update_layout(margin={&amp;quot;r&amp;quot;:0,&amp;quot;t&amp;quot;:0,&amp;quot;l&amp;quot;:0,&amp;quot;b&amp;quot;:0})
fig.show()&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;And here is our map. According to the plot, Moscow has the highest number of new cases per day: 608. This is very high compared to the other regions, especially Nenets Autonomous Okrug, where the number is surprisingly low.&lt;/p&gt;
&lt;iframe id="igraph" scrolling="no" style="border:none;"seamless="seamless" src="http://map-env.eba-qra4m2aq.us-east-2.elasticbeanstalk.com/"  height="500" width="800"&gt;&lt;/iframe&gt;
&lt;p&gt;&lt;a href="https://github.com/valiotti/leftjoin/tree/master/plotly_russian_map"&gt;View the code on GitHub&lt;/a&gt;&lt;/p&gt;
</description>
<pubDate>Thu, 13 Aug 2020 09:30:00 +0300</pubDate>
</item>

<item>
<title>Deploying Analytical Web App with AWS Elastic Beanstalk</title>
<guid isPermaLink="false">42</guid>
<link>https://en.leftjoin.ru/all/deploying-analytical-web-app-with-aws-elastic-beanstalk/</link>
<comments>https://en.leftjoin.ru/all/deploying-analytical-web-app-with-aws-elastic-beanstalk/</comments>
<description>
&lt;p&gt;If you need to deploy a web application and there’s an AWS EC2 Instance at hand, why not use Elastic Beanstalk? This is an AWS service that allows us to orchestrate many other ones, including EC2, S3, Simple Notification Service, CloudWatch, etc.&lt;/p&gt;
&lt;h2&gt;Setting things up&lt;/h2&gt;
&lt;p&gt;Previously, in our article &lt;a href="https://www.valiotti.com/leftjoin/all/building-a-plotly-dashboard-with-dynamic-sliders-in-python/"&gt;“Building a Plotly Dashboard with dynamic sliders in Python”&lt;/a&gt;, we created a project with two scripts: application.py, which creates a dashboard on a local server, and get_plots.py, which returns a scatter plot with Untappd breweries from &lt;a href="https://www.valiotti.com/leftjoin/all/building-a-scatter-plot-for-untappd-breweries/" class="nu"&gt;“&lt;u&gt;Building a scatter plot for Untappd Breweries&lt;/u&gt;”&lt;/a&gt;. Let’s modify the application.py script a bit to make it run with Elastic Beanstalk. Assign app.server to the application variable, so that the end of the script looks something like this:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;application = app.server

if __name__ == '__main__':
   application.run(debug=True, port=8080)&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Before deploying our app we need to create a compressed archive. This archive should contain all the necessary files, including requirements.txt, which specifies the Python packages required to run the project. Just type pip freeze in your terminal window and save the output to a file:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;pip freeze &amp;gt; requirements.txt&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Now we can create a compressed archive. Unix-based systems have a built-in zip command for archiving and compression:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;zip deploy_v0 application.py get_plots.py requirements.txt&lt;/code&gt;&lt;/pre&gt;&lt;h2&gt;Application and Environment&lt;/h2&gt;
&lt;p&gt;Navigate to  &lt;a href="https://us-east-2.console.aws.amazon.com/elasticbeanstalk/home?region=us-east-2"&gt;Elastic Beanstalk&lt;/a&gt;, click the “Applications” section and then “Create a new application”.&lt;/p&gt;
&lt;div class="e2-text-picture"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/1-14.png" width="915" height="170" alt="" /&gt;
&lt;/div&gt;
&lt;p&gt;Fill in the necessary fields by specifying your app name and its description. After this, we are prompted to assign metadata by tagging our app. A tag is similar to a dictionary entry in Python: it is a key-value pair, and each key must be unique. Once you’re ready to continue, click the orange “Create” button.&lt;/p&gt;
&lt;div class="e2-text-picture"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/2-14.png" width="907" height="358" alt="" /&gt;
&lt;/div&gt;
&lt;p&gt;After this step, you will see a list of environments available for your app, which is initially empty. Click “Create a new environment”.&lt;/p&gt;
&lt;div class="e2-text-picture"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/3-13.png" width="912" height="326" alt="" /&gt;
&lt;/div&gt;
&lt;p&gt;Since we are working with a web app, we need to select a web server environment:&lt;/p&gt;
&lt;div class="e2-text-picture"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/4-11.png" width="906" height="413" alt="" /&gt;
&lt;/div&gt;
&lt;p&gt;In the next step we need to specify our environment name and also choose a domain name, if available:&lt;/p&gt;
&lt;div class="e2-text-picture"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/5-9.png" width="897" height="546" alt="" /&gt;
&lt;/div&gt;
&lt;p&gt;Next, we select the platform for our app, which is written in Python:&lt;/p&gt;
&lt;div class="e2-text-picture"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/6-7.png" width="898" height="413" alt="" /&gt;
&lt;/div&gt;
&lt;p&gt;Now we can upload the file with our app. Click “Upload your code” and attach the compressed file. Afterward, click “Create environment”.&lt;/p&gt;
&lt;div class="e2-text-picture"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/7-6.png" width="893" height="834" alt="" /&gt;
&lt;/div&gt;
&lt;p&gt;You will see a terminal window with event logs. We have a couple of minutes for a coffee break.&lt;/p&gt;
&lt;div class="e2-text-picture"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/8-5.png" width="912" height="326" alt="" /&gt;
&lt;/div&gt;
&lt;p&gt;Now our app is up and running. If you need to upload a new version, just create a new archive with the updated files and click the “Upload and deploy” button again. If everything’s done right, you will see something like this:&lt;/p&gt;
&lt;div class="e2-text-picture"&gt;
&lt;img src="https://en.leftjoin.ru/pictures/9-5.png" width="914" height="388" alt="" /&gt;
&lt;/div&gt;
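&lt;p&gt;If you rebuild the archive often, the step can be scripted with Python’s standard zipfile module instead of the zip command. This is a minimal sketch rather than part of the original project: the build_archive helper is hypothetical, and the demo uses placeholder files in a temporary folder in place of the real scripts.&lt;/p&gt;

```python
import tempfile
import zipfile
from pathlib import Path


def build_archive(project_dir, version, filenames):
    """Zip the listed files into deploy_v{version}.zip and return its path."""
    project_dir = Path(project_dir)
    archive_path = project_dir / f"deploy_v{version}.zip"
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in filenames:
            # arcname keeps each file at the archive root instead of
            # storing the full folder path inside the zip.
            archive.write(project_dir / name, arcname=name)
    return archive_path


# Demo in a temporary folder with placeholder files.
files = ["application.py", "get_plots.py", "requirements.txt"]
with tempfile.TemporaryDirectory() as tmp:
    for name in files:
        (Path(tmp) / name).write_text("# placeholder\n")
    archive_path = build_archive(tmp, 1, files)
    with zipfile.ZipFile(archive_path) as archive:
        archived = sorted(archive.namelist())

print(archive_path.name, archived)
```

&lt;p&gt;Putting the version number in the file name makes it easier to tell uploads apart in the Elastic Beanstalk console.&lt;/p&gt;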
&lt;p&gt;We can open the site with our dashboard by following the link at the top of the environment page. Using the &amp;lt;iframe&amp;gt; tag, our dashboard can be embedded into any other site.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;&amp;lt;iframe id=&amp;quot;igraph&amp;quot; scrolling=&amp;quot;no&amp;quot; style=&amp;quot;border:none;&amp;quot; seamless=&amp;quot;seamless&amp;quot; src=&amp;quot;http://dashboard1-env.eba-fvfdgmks.us-east-2.elasticbeanstalk.com/&amp;quot; height=&amp;quot;1100&amp;quot; width=&amp;quot;800&amp;quot;&amp;gt;&amp;lt;/iframe&amp;gt;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;As a result, you can see the following dashboard:&lt;/p&gt;
&lt;div class="embed-responsive embed-responsive-4by3" style="min-width:500"&gt;&lt;iframe id="igraph" scrolling="no" style="border:none;" seamless="seamless" src="http://webappsdashboard-env.eba-z4jtzsq2.sa-east-1.elasticbeanstalk.com/" height="1100" width="800"&gt;&lt;/iframe&gt;
&lt;/div&gt;&lt;p&gt;&lt;a href="https://github.com/valiotti/leftjoin/tree/master/dashboard_deployment"&gt;View the code on Github&lt;/a&gt;&lt;/p&gt;
</description>
<pubDate>Thu, 06 Aug 2020 09:44:07 +0300</pubDate>
</item>

<item>
<title>Building a Plotly Dashboard with dynamic sliders in Python</title>
<guid isPermaLink="false">38</guid>
<link>https://en.leftjoin.ru/all/building-a-plotly-dashboard-with-dynamic-sliders-in-python/</link>
<comments>https://en.leftjoin.ru/all/building-a-plotly-dashboard-with-dynamic-sliders-in-python/</comments>
<description>
&lt;p&gt;Recently we discussed how to use Plotly and built a scatter plot to display the ratio between the number of reviews and the average rating for Russian breweries registered on Untappd. Each marker on the plot has two properties: the registration period and the beer range. Today we are going to introduce you to Dash, a Python framework for building analytical web applications. First, create a new file named app.py and import the get_scatter_plot(n_days, top_n) function from the previous article.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;import dash
import dash_core_components as dcc
import dash_html_components as html
from get_plots import get_scatter_plot&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;After importing the necessary libraries we need to load CSS styles and initialize our web app:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']
app = dash.Dash(__name__, external_stylesheets=external_stylesheets)&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Create a dashboard structure:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;app.layout = html.Div(children=[
       html.Div([
           dcc.Graph(id='fig1'),
       ]) ,
       html.Div([
           html.H6('Time period (days)'),
           dcc.Slider(
               id='slider-day1',
               min=0,
               max=100,
               step=1,
               value=30,
               marks={i: str(i) for i in range(0, 100, 10)}
           ),
           html.H6('Number of breweries from the top'),
           dcc.Slider(
               id='slider-top1',
               min=0,
               max=500,
               step=50,
               value=500,
               marks={i: str(i) for i in range(0, 500, 50)})
       ])
])&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Now we have a plot and two sliders, each with its own id and parameters: minimum value, maximum value, step, and initial value. Since the plot must react to the slider values, we need to create a callback. Output is the first argument and names the property that receives our plot; the Input arguments that follow name the values the plot depends on.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;@app.callback(
   dash.dependencies.Output('fig1', 'figure'),
   [dash.dependencies.Input('slider-day1', 'value'),
    dash.dependencies.Input('slider-top1', 'value')])
def output_fig(n_days, top_n):
    return get_scatter_plot(n_days, top_n)&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;At the end of our script we will add the following lines to run our code:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;if __name__ == '__main__':
   app.run_server(debug=True)&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Now, whenever the script is running, the local address of the server is displayed in the terminal. Open it in a web browser to view the interactive dashboard, which updates automatically when you move the sliders.&lt;/p&gt;
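&lt;p&gt;One detail of the slider definitions above is worth a quick check: Python’s range excludes its stop value, so range(0, 100, 10) yields marks up to 90 only and the right end of the slider stays unlabeled. A small helper, hypothetical and not part of the original script, can build the marks so that the maximum is included:&lt;/p&gt;

```python
def slider_marks(minimum, maximum, step):
    """Build a marks dict that includes the maximum value itself."""
    return {value: str(value) for value in range(minimum, maximum + 1, step)}


day_marks = slider_marks(0, 100, 10)  # marks for the time period slider
top_marks = slider_marks(0, 500, 50)  # marks for the top breweries slider
print(sorted(day_marks)[-1], sorted(top_marks)[-1])
```

&lt;p&gt;The resulting dictionaries can be passed to the marks argument of dcc.Slider in place of the inline comprehensions.&lt;/p&gt;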
&lt;div class="embed-responsive embed-responsive-4by3" style="min-width:800"&gt;&lt;iframe id="igraph" scrolling="no" style="border:none;"seamless="seamless" src="http://dasheng-env.eba-ueep9ck7.us-east-2.elasticbeanstalk.com" height="1100" width="800" &gt;&lt;/iframe&gt;
&lt;/div&gt;</description>
<pubDate>Mon, 03 Aug 2020 09:05:31 +0300</pubDate>
</item>

<item>
<title>Building a scatter plot for Untappd Breweries</title>
<guid isPermaLink="false">37</guid>
<link>https://en.leftjoin.ru/all/building-a-scatter-plot-for-untappd-breweries/</link>
<comments>https://en.leftjoin.ru/all/building-a-scatter-plot-for-untappd-breweries/</comments>
<description>
&lt;p&gt;Today we are going to build a scatter plot for Russian breweries that displays the ratio between the number of reviews and their average ratings for the past 30 days. The data comes from check-ins left by Untappd users who rated beers. To make the plot we need markers with a specified color and size. The color depends on the brewery’s registration date, showing how long it has been registered on Untappd, while the size of a marker reflects the range of beers the brewery offers. This article is the first part of our series dedicated to building dashboards with Plotly.&lt;/p&gt;
&lt;h2&gt;Writing a Clickhouse query&lt;/h2&gt;
&lt;p&gt;First, we need to process the data before using it in our dashboard. Here, we are using public data collected from Untappd. You can find more about this in our previous articles: &lt;a href="https://www.valiotti.com/leftjoin/all/handling-website-buttons-in-selenium/" class="nu"&gt;“&lt;u&gt;Handling website buttons in Selenium&lt;/u&gt;”&lt;/a&gt; and &lt;a href="https://www.valiotti.com/leftjoin/all/example-of-using-dictionaries-in-clickhouse-with-untappd/" class="nu"&gt;“&lt;u&gt;Example of using dictionaries in Clickhouse with Untappd&lt;/u&gt;”&lt;/a&gt;.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;from datetime import datetime, timedelta
from clickhouse_driver import Client
import plotly.graph_objects as go
import pandas as pd
import numpy as np
client = Client(host='ec1-2-34-567-89.us-east-2.compute.amazonaws.com', user='default', password='', port='9000', database='default')&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Our scatter plot will depend on the &lt;span class="inline-code"&gt;get_scatter_plot(n_days, top_n)&lt;/span&gt; function, which takes two arguments denoting a time span and a number of breweries to display. Let’s write a SQL query to calculate the Brewery Pure Average. It can be described &lt;a href="https://help.untappd.com/hc/en-us/articles/360034136372-How-are-ratings-determined-on-Untappd-"&gt;as follows&lt;/a&gt;: multiply the beer rating by the total number of ratings and divide the result by the number of brewery reviews. The query also returns the brewery name and its beer range, which are fetched from our dictionary using the &lt;span class="inline-code"&gt;dictGet&lt;/span&gt; function. We are only interested in those breweries that have Brewery Pure Average &gt; 0 and the number of reviews &gt; 100.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;brewery_pure_average = client.execute(f&amp;quot;&amp;quot;&amp;quot;
SELECT
       t1.brewery_id,
       sum(t1.beer_pure_average_mult_count / t2.count_for_that_brewery) AS brewery_pure_average,
       t2.count_for_that_brewery,
       dictGet('breweries', 'brewery_name', toUInt64(t1.brewery_id)),
       dictGet('breweries', 'beer_count', toUInt64(t1.brewery_id)),
       t3.stats_age_on_service / 365
   FROM
   (
       SELECT
           beer_id,
           brewery_id,
           sum(rating_score) AS beer_pure_average_mult_count
       FROM beer_reviews
       WHERE created_at &amp;gt;= today()-{n_days}
       GROUP BY
           beer_id,
           brewery_id
   ) AS t1
   ANY LEFT JOIN
   (
       SELECT
           brewery_id,
           count(rating_score) AS count_for_that_brewery
       FROM beer_reviews
       WHERE created_at &amp;gt;= today()-{n_days}
       GROUP BY brewery_id
   ) AS t2 ON t1.brewery_id = t2.brewery_id
   ANY LEFT JOIN
   (
       SELECT
           brewery_id,
           stats_age_on_service
       FROM brewery_info
   ) AS t3 ON t1.brewery_id = t3.brewery_id
   GROUP BY
       t1.brewery_id,
       t2.count_for_that_brewery,
       t3.stats_age_on_service
   HAVING t2.count_for_that_brewery &amp;gt;= 150
   ORDER BY brewery_pure_average
   LIMIT {top_n}
    &amp;quot;&amp;quot;&amp;quot;)

scatter_plot_df_with_age = pd.DataFrame(brewery_pure_average, columns=['brewery_id', 'brewery_pure_average', 'rating_count', 'brewery_name', 'beer_count', 'age_on_service'])&lt;/code&gt;&lt;/pre&gt;&lt;h2&gt;Working with a DataFrame&lt;/h2&gt;
&lt;p&gt;Add two dotted lines that pass through the median value of each axis. That way we can see which breweries are above the median on both measures: the best ones will be in the upper right area.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;dict_list = []
dict_list.append(dict(type=&amp;quot;line&amp;quot;,
                     line=dict(
                         color=&amp;quot;#666666&amp;quot;,
                         dash=&amp;quot;dot&amp;quot;),
                     x0=0,
                     y0=np.median(scatter_plot_df_with_age.brewery_pure_average),
                     x1=7000,
                     y1=np.median(scatter_plot_df_with_age.brewery_pure_average),
                     line_width=1,
                     layer=&amp;quot;below&amp;quot;))
dict_list.append(dict(type=&amp;quot;line&amp;quot;,
                     line=dict(
                         color=&amp;quot;#666666&amp;quot;,
                         dash=&amp;quot;dot&amp;quot;),
                     x0=np.median(scatter_plot_df_with_age.rating_count),
                     y0=0,
                     x1=np.median(scatter_plot_df_with_age.rating_count),
                     y1=5,
                     line_width=1,
                     layer=&amp;quot;below&amp;quot;))&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Add annotations that label the median values next to the lines:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;annotations_list = []
annotations_list.append(
    dict(
        x=8000,
        y=np.median(scatter_plot_df_with_age.brewery_pure_average) - 0.1,
        xref=&amp;quot;x&amp;quot;,
        yref=&amp;quot;y&amp;quot;,
        text=f&amp;quot;Median value: {round(np.median(scatter_plot_df_with_age.brewery_pure_average), 2)}&amp;quot;,
        showarrow=False,
        font={
            'family':'Roboto, light',
            'color':'#666666',
            'size':12
        }
    )
)
annotations_list.append(
    dict(
        x=np.median(scatter_plot_df_with_age.rating_count) + 180,
        y=0.8,
        xref=&amp;quot;x&amp;quot;,
        yref=&amp;quot;y&amp;quot;,
        text=f&amp;quot;Median value: {round(np.median(scatter_plot_df_with_age.rating_count), 2)}&amp;quot;,
        showarrow=False,
        font={
            'family':'Roboto, light',
            'color':'#666666',
            'size':12
        },
        textangle=-90
    )
)&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Let’s make our plot more informative by splitting breweries into 4 groups according to their beer range. The first group includes breweries with fewer than 10 brands, the second those with 10-30 brands, the third those with 31-50 brands, and the last one large breweries with more than 50 brands. We store the marker sizes in the &lt;span class="inline-code"&gt;bucket_beer_count&lt;/span&gt; list.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;bucket_beer_count = []
for beer_count in scatter_plot_df_with_age.beer_count:
   if beer_count &amp;lt; 10:
       bucket_beer_count.append(7)
   elif 10 &amp;lt;= beer_count &amp;lt;= 30:
       bucket_beer_count.append(9)
   elif 31 &amp;lt;= beer_count &amp;lt;= 50:
       bucket_beer_count.append(11)
   else:
       bucket_beer_count.append(13)
scatter_plot_df_with_age['bucket_beer_count'] = bucket_beer_count&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The next step is to split the breweries by their age on the service:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;bucket_age = []
for age in scatter_plot_df_with_age.age_on_service:
   if age &amp;lt; 4:
       bucket_age.append(0)
   elif 4 &amp;lt;= age &amp;lt;= 6:
       bucket_age.append(1)
   elif 6 &amp;lt; age &amp;lt; 8:
       bucket_age.append(2)
   else:
       bucket_age.append(3)
scatter_plot_df_with_age['bucket_age'] = bucket_age&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Let’s divide our DataFrame into 4 parts, one per age group, so that each part can be drawn as a separate scatter trace with its own color and marker sizes.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;scatter_plot_df_0 = scatter_plot_df[scatter_plot_df.bucket == 0]
scatter_plot_df_1 = scatter_plot_df[scatter_plot_df.bucket == 1]
scatter_plot_df_2 = scatter_plot_df[scatter_plot_df.bucket == 2]
scatter_plot_df_3 = scatter_plot_df[scatter_plot_df.bucket == 3]&lt;/code&gt;&lt;/pre&gt;&lt;h2&gt;Plotting&lt;/h2&gt;
&lt;p&gt;Now we are ready to build the plot. We add the 4 brewery groups one by one, setting the key parameters for each: name, marker color, marker opacity, and hover text.&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;fig = go.Figure()
fig.add_trace(go.Scatter(
    x=scatter_plot_df_0.rating_count,
    y=scatter_plot_df_0.brewery_pure_average,
    name='&amp;lt; 4',
    mode='markers',
    opacity=0.85,
    text=scatter_plot_df_0.name_count,
    marker_color='rgb(114, 183, 178)',
    marker_size=scatter_plot_df_0.bucket_beer_count,
    textfont={&amp;quot;family&amp;quot;:&amp;quot;Roboto, light&amp;quot;,
              &amp;quot;color&amp;quot;:&amp;quot;black&amp;quot;
             }
))

fig.add_trace(go.Scatter(
    x=scatter_plot_df_1.rating_count,
    y=scatter_plot_df_1.brewery_pure_average,
    name='4 – 6',
    mode='markers',
    opacity=0.85,
    marker_color='rgb(76, 120, 168)',
    text=scatter_plot_df_1.name_count,
    marker_size=scatter_plot_df_1.bucket_beer_count,
    textfont={&amp;quot;family&amp;quot;:&amp;quot;Roboto, light&amp;quot;,
              &amp;quot;color&amp;quot;:&amp;quot;black&amp;quot;
             }
))

fig.add_trace(go.Scatter(
    x=scatter_plot_df_2.rating_count,
    y=scatter_plot_df_2.brewery_pure_average,
    name='6 – 8',
    mode='markers',
    opacity=0.85,
    marker_color='rgb(245, 133, 23)',
    text=scatter_plot_df_2.name_count,
    marker_size=scatter_plot_df_2.bucket_beer_count,
    textfont={&amp;quot;family&amp;quot;:&amp;quot;Roboto, light&amp;quot;,
              &amp;quot;color&amp;quot;:&amp;quot;black&amp;quot;
             }
))

fig.add_trace(go.Scatter(
    x=scatter_plot_df_3.rating_count,
    y=scatter_plot_df_3.brewery_pure_average,
    name='8+',
    mode='markers',
    opacity=0.85,
    marker_color='rgb(228, 87, 86)',
    text=scatter_plot_df_3.name_count,
    marker_size=scatter_plot_df_3.bucket_beer_count,
    textfont={&amp;quot;family&amp;quot;:&amp;quot;Roboto, light&amp;quot;,
              &amp;quot;color&amp;quot;:&amp;quot;black&amp;quot;
             }
))

fig.update_layout(
    title=f&amp;quot;The ratio between the number of reviews and the average brewery rating for the past &amp;lt;br&amp;gt; {n_days} days, top {top_n} breweries&amp;quot;,
    font={
            'family':'Roboto, light',
            'color':'black',
            'size':14
        },
    plot_bgcolor='rgba(0,0,0,0)',
    yaxis_title=&amp;quot;Average rating&amp;quot;,
    xaxis_title=&amp;quot;Number of reviews&amp;quot;,
    legend_title_text='Registration period&amp;lt;br&amp;gt; on Untappd in years:',
    height=750,
    shapes=dict_list,
    annotations=annotations_list
)&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Voila, the scatter plot is done! Each point is a separate brewery. The color shows how long the brewery has been registered on Untappd, and the marker size shows its beer range. Hovering over a point brings up a summary with the average rating for the past 30 days, the number of reviews, the brewery name, and the beer range. The dotted lines pass through the median values we calculated with NumPy, so the best breweries, those with both many reviews and a high rating, end up in the upper right. In our next article, we are going to create a breweries dashboard with dynamic parameters.&lt;/p&gt;
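&lt;p&gt;As a side note, the &lt;span class="inline-code"&gt;if/elif&lt;/span&gt; chain that maps beer counts to marker sizes can also be written as a threshold lookup. Here is a minimal sketch with &lt;span class="inline-code"&gt;bisect&lt;/span&gt; from the Python standard library. The names &lt;span class="inline-code"&gt;SIZE_BOUNDS&lt;/span&gt;, &lt;span class="inline-code"&gt;MARKER_SIZES&lt;/span&gt; and &lt;span class="inline-code"&gt;marker_size&lt;/span&gt; are hypothetical and not part of the dashboard code, the sample counts are made up, and we assume that beer counts are whole numbers:&lt;/p&gt;
&lt;pre class="e2-text-code"&gt;&lt;code&gt;from bisect import bisect_right

# Lower bounds of the 2nd, 3rd and 4th groups
# (assumption: beer counts are integers)
SIZE_BOUNDS = [10, 31, 51]
# Marker sizes of the four groups, same values as in bucket_beer_count
MARKER_SIZES = [7, 9, 11, 13]


def marker_size(beer_count):
    # bisect_right counts the bounds that are less than or equal to
    # beer_count, which is the index of the group the brewery falls into
    return MARKER_SIZES[bisect_right(SIZE_BOUNDS, beer_count)]


# Made-up beer counts, one per group
sample_sizes = [marker_size(count) for count in (3, 10, 31, 120)]&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Applied to every value of the &lt;span class="inline-code"&gt;beer_count&lt;/span&gt; column, this function should give the same sizes as the loop, as long as the counts are integers. Changing a group boundary then means editing one list instead of several conditions.&lt;/p&gt;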
&lt;div class="embed-responsive embed-responsive-4by3" style="min-width:500"&gt;&lt;iframe id="igraph" scrolling="no" style="border:none;"seamless="seamless" src="//plotly.com/~i-bond/20.embed?showlink=false" height="800" width="900"&gt;&lt;/iframe&gt;
&lt;/div&gt;</description>
<pubDate>Wed, 15 Jul 2020 15:32:22 +0300</pubDate>
</item>


</channel>
</rss>