{
    "version": "https:\/\/jsonfeed.org\/version\/1",
    "title": "LEFT JOIN: blog on analytics, visualisation & data science, posts tagged: untappd",
    "home_page_url": "https:\/\/en.leftjoin.ru\/tags\/untappd\/",
    "feed_url": "https:\/\/en.leftjoin.ru\/tags\/untappd\/json\/",
    "icon": "https:\/\/en.leftjoin.ru\/user\/userpic@2x.jpg",
    "author": {
        "name": "Nikolay Valiotti",
        "url": "https:\/\/en.leftjoin.ru\/",
        "avatar": "https:\/\/en.leftjoin.ru\/user\/userpic@2x.jpg"
    },
    "items": [
        {
            "id": "43",
            "url": "https:\/\/en.leftjoin.ru\/all\/how-to-build-dashboard-with-bootstrap-4-from-scratch-part-1\/",
            "title": "How to build a dashboard with Bootstrap 4 from scratch (Part 1)",
            "content_html": "<p>In previous articles we reviewed  Plotly’s Dash Framework,  <a href=\"https:\/\/en.leftjoin.ru\/all\/building-a-scatter-plot-for-untappd-breweries\/\">learned to build scatter plots<\/a> and <a href=\"https:\/\/en.leftjoin.ru\/all\/visualizing-covid-19-in-russia-with-plotly\/\"> create a map visualization<\/a>. This time we will summarize our knowledge and put all the pieces together to design a dashboard layout using the Bootstrap 4 grid system.<br \/>\nTo facilitate the development, we’ll refer to the <a href=\"https:\/\/dash-bootstrap-components.opensource.faculty.ai\/\">dash-bootstrap-components<\/a> library. This is a great tool that integrates Bootstrap in Dash, allowing us to write web pages in pure Python, and add any Bootstrap components and styling.<\/p>\n<h2>Draft Layout<\/h2>\n<p>Before we begin coding it’s crucial to have a plan of our app, a rough layout that would help us to see the big picture and quickly modify the structure.  We used <a href=\"https:\/\/app.diagrams.net\/\">draw.io<\/a> to make a dashboard draft, this application enables to create diagrams, graphs,  flowcharts, and forms at the click of a button. The dashboard will be built according to this template:<\/p>\n<div class=\"e2-text-picture\">\n<img src=\"https:\/\/en.leftjoin.ru\/pictures\/template_1@2x.png\" width=\"872\" height=\"946\" alt=\"\" \/>\n<\/div>\n<p>Like the dashboard itself,  the top header will be colored in gold and white, the main colors of <a href=\"https:\/\/untappd.com\/\">Untappd<\/a>.  Just below the header, there is a section with breweries, which includes a scatter plot and a control panel.  And at the bottom of the page, there will be a map showing beverage rating across the regions of Russia.<\/p>\n<p>All right, let’s get started, first create a new python file with the name application.py. The file will store all the front end components of the dashboard, and create a new directory named assets. The directory structure should be similar:<\/p>\n<pre class=\"e2-text-code\"><code>- application.py\r\n- assets\/\r\n    |-- typography.css\r\n    |-- header.css\r\n    |-- custom-script.js\r\n    |-- image.png<\/code><\/pre><p>Then we import the libraries and initialize our application:<\/p>\n<pre class=\"e2-text-code\"><code>import dash\r\nimport dash_bootstrap_components as dbc\r\nimport dash_html_components as html\r\nimport dash_core_components as dcc\r\nimport pandas as pd\r\nfrom get_ratio_scatter_plot import get_plot\r\nfrom get_russian_map import get_map\r\nfrom clickhouse_driver import Client\r\nfrom dash.dependencies import Input, Output\r\n\r\nstandard_BS = dbc.themes.BOOTSTRAP\r\napp = dash.Dash(__name__, external_stylesheets=[standard_BS])<\/code><\/pre><p>Main parameters of the app:<br \/>\n<span class=\"inline-code\">__name__<\/span>  — to enable access to static elements stored in the assets folder (such as images, CSS and JS files)<br \/>\n<span class=\"inline-code\">external_stylesheets<\/span> — external CSS styling,  here we are using a standard Bootstrap theme, however you can create your own theme  or use any of <a href=\"https:\/\/www.bootstrapcdn.com\/bootswatch\/\"> the availables ones<\/a>.<\/p>\n<p>Hook up a few more things to work with local files  and connect to the Clickhouse Database:<\/p>\n<pre class=\"e2-text-code\"><code>app.scripts.config.serve_locally = True\r\napp.css.config.serve_locally = True\r\n\r\nclient = Client(host='ec2-3-16-148-63.us-east-2.compute.amazonaws.com',\r\n                user='default',\r\n                password='',\r\n                port='9000',\r\n                database='default')<\/code><\/pre><p>Add a palette of colors:<\/p>\n<pre class=\"e2-text-code\"><code>colors = ['#ffcc00', \r\n          '#f5f2e8', \r\n          '#f8f3e3',\r\n          '#ffffff', \r\n          ]<\/code><\/pre><h2>Creating a layout<\/h2>\n<p>All the dashboard elements will be placed within a Bootstrap container,  which is in the  &lt;div&gt block:<\/p>\n<pre class=\"e2-text-code\"><code>- app \r\n    |-- div\r\n     |-- container\r\n      |-- logo&amp;header\r\n     |-- container\r\n      |-- div\r\n       |-- controls&amp;scatter\r\n       |-- map<\/code><\/pre><pre class=\"e2-text-code\"><code>app.layout = html.Div(\r\n                    [\r\n                        dbc.Container(\r\n\r\n                                         &lt; header&gt;\r\n                         \r\n                        dbc.Container(       \r\n                            html.Div(\r\n                                [\r\n                        \r\n                                    &lt; body &gt;\r\n                        \r\n                                ],\r\n                            ),\r\n                            fluid=False, style={'max-width': '1300px'},\r\n                        ),\r\n                    ],\r\n                    style={'background-color': colors[1], 'font-family': 'Proxima Nova Bold'},\r\n                )<\/code><\/pre><p>Here we set a fixed container width, background color, and font style of the page that is stored in typography.css in the assets folder. Let’s take a closer look at the first element in the div block,  that’s the top header with the Untappd logo:<\/p>\n<pre class=\"e2-text-code\"><code>logo = html.Img(src=app.get_asset_url('logo.png'),\r\n                        style={'width': &quot;128px&quot;, 'height': &quot;128px&quot;,\r\n                        }, className='inline-image')<\/code><\/pre><p>and the header:<\/p>\n<pre class=\"e2-text-code\"><code>header = html.H3(&quot;Russian breweries stats from Untappd&quot;, style={'text-transform': &quot;uppercase&quot;})<\/code><\/pre><p>We used Bootstrap Forms to position these two elements on the same level.<\/p>\n<pre class=\"e2-text-code\"><code>logo_and_header = dbc.FormGroup(\r\n        [\r\n            logo,\r\n            html.Div(\r\n                [\r\n                    header\r\n                ],\r\n                className=&quot;p-5&quot;\r\n            )\r\n        ],\r\n        className='form-row',\r\n)<\/code><\/pre><p>The class name  ‘p-5’ allows to increase padding and vertically align the title while specifying ‘form-row’  as the form class name we put the logo and header in one row.  At this point, the top header should  look the following:<\/p>\n<div class=\"e2-text-picture\">\n<img src=\"https:\/\/en.leftjoin.ru\/pictures\/logo_and_header.png\" width=\"2132\" height=\"242\" alt=\"\" \/>\n<\/div>\n<p>Now we need to center the elements and add some colors.  Create a separate container that will take one row. Specify <span class=\"inline-code\">‘d-flex justify-content-center’<\/span> in the <span class=\"inline-code\">className<\/span>  to achieve the same output.<\/p>\n<pre class=\"e2-text-code\"><code>dbc.Container(\r\n                    dbc.Row(\r\n                        [\r\n                            dbc.Col(\r\n                                html.Div(\r\n                                    logo_and_header,\r\n                                ),\r\n                            ),\r\n                        ],\r\n                        style={'max-height': '128px',\r\n                               'color': 'white',\r\n                       }\r\n\r\n                    ),\r\n                    className='d-flex justify-content-center',\r\n                    style={'max-width': '100%',\r\n                           'background-color': colors[0]},\r\n                ),<\/code><\/pre><p>And now the top header is done:<\/p>\n<div class=\"e2-text-picture\">\n<img src=\"https:\/\/en.leftjoin.ru\/pictures\/top-header.png\" width=\"2200\" height=\"245\" alt=\"\" \/>\n<\/div>\n<p>We’re approaching the main part, create the next Bootstrap Container and add a subheading:<\/p>\n<pre class=\"e2-text-code\"><code>dbc.Container(\r\n                    html.Div(\r\n                        [\r\n                            html.Br(),\r\n                            html.H5(&quot;Breweries&quot;, style={'text-align':'center', 'text-transform': 'uppercase'}),\r\n                            html.Hr(), # horizontal  break<\/code><\/pre><p>The main body will consist of Bootstrap Cards, they can provide a structured layout of all parts,  giving each element a clear border and saving the white space. Create the next element, a control panel with sliders:<\/p>\n<pre class=\"e2-text-code\"><code>slider_day_values = [1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]\r\nslider_top_breweries_values = [5, 25, 50, 75, 100, 125, 150, 175, 200]\r\n\r\ncontrols = dbc.Card(\r\n    [\r\n       dbc.CardBody(\r\n           [\r\n               dbc.FormGroup(\r\n                    [\r\n                        dbc.Label(&quot;Time Period&quot;, style={'text-align': 'center', 'font-size': '100%', 'text-transform': 'uppercase'}),\r\n                        dcc.Slider(\r\n                            id='slider-day',\r\n                            min=1,\r\n                            max=100,\r\n                            step=10,\r\n                            value=100,\r\n                            marks={i: i for i in slider_day_values}\r\n                        ),\r\n                    ], style={'text-align': 'center'}\r\n               ),\r\n               dbc.FormGroup(\r\n                    [\r\n                        dbc.Label(&quot;Number of breweries&quot;, style={'text-align': 'center', 'font-size': '100%', 'text-transform': 'uppercase'}),\r\n                        dcc.Slider(\r\n                            id='slider-top-breweries',\r\n                            min=5,\r\n                            max=200,\r\n                            step=5,\r\n                            value=200,\r\n                            marks={i: i for i in slider_top_breweries_values}\r\n                        ),\r\n                    ], style={'text-align': 'center'}\r\n               ),\r\n           ],\r\n       )\r\n    ],\r\n    style={'height': '32.7rem', 'background-color': colors[3]}\r\n)<\/code><\/pre><div class=\"e2-text-picture\">\n<img src=\"https:\/\/en.leftjoin.ru\/pictures\/2@2x.png\" width=\"291.5\" height=\"149\" alt=\"\" \/>\n<\/div>\n<p>The control panel consists of two sliders that can be used to change the view on the scatter, they are positioned one below the other in a Bootstrap Form. The sliders were put inside the dbc.CardBody block, other elements will be added in the same way. It allows to eliminate alignment problem and achieve clear borders.  By default, the sliders are painted in blue, but we can easily customize them by changing the properties of the class in sliders.css.  Add the control panel with the scatter plot as follows:<\/p>\n<pre class=\"e2-text-code\"><code>dbc.Row(\r\n                [\r\n                    dbc.Col(controls, width={&quot;size&quot;: 4,\r\n                                     &quot;order&quot;: 'first',\r\n                                             &quot;offset&quot;: 0},\r\n                     ),\r\n                     dbc.Col(dbc.Card(\r\n                                [\r\n                                    dbc.CardBody(\r\n                                        [\r\n                                            html.H6(&quot;The ratio between the number of reviews and the average brewery rating&quot;,\r\n                                                    className=&quot;card-title&quot;,\r\n                                                    style={'text-transform': 'uppercase'}), \r\n                                            dcc.Graph(id='ratio-scatter-plot'),\r\n                                        ],\r\n                                    ),\r\n                                ],\r\n                                style={'background-color': colors[2], 'text-align':'center'}\r\n                             ),\r\n                     md=8),\r\n                ],\r\n                align=&quot;start&quot;,\r\n                justify='center',\r\n            ),\r\nhtml.Br(),<\/code><\/pre><p>And at the bottom of the page we will position the map:<\/p>\n<pre class=\"e2-text-code\"><code>html.H5(&quot;Venues and Regions&quot;, style={'text-align':'center', 'text-transform': 'uppercase',}),\r\n                            html.Hr(), # horizontal  break\r\n                            dbc.Row(\r\n                                [\r\n                                    dbc.Col(\r\n                                        dbc.Card(\r\n                                            [\r\n                                                dbc.CardBody(\r\n                                                    [\r\n                                                        html.H6(&quot;Average beer rating across regions&quot;,\r\n                                                                className=&quot;card-title&quot;,\r\n                                                                style={'text-transform': 'uppercase'},\r\n                                                        ),  \r\n                                                        dcc.Graph(figure=get_map())\r\n                                                    ],\r\n                                                ),\r\n                                            ],\r\n                                        style={'background-color': colors[2], 'text-align': 'center'}\r\n                                        ),\r\n                                md=12),\r\n                                ]\r\n                            ),\r\n                            html.Br(),<\/code><\/pre><h2>Callbacks in Dash<\/h2>\n<p>Callback functions allow making dashboard elements interactive through the  Input and Output properties of a particular component.<\/p>\n<pre class=\"e2-text-code\"><code>@app.callback(\r\n    Output('ratio-scatter-plot', 'figure'),\r\n    [Input('slider-day', 'value'),\r\n     Input('slider-top-breweries', 'value'),\r\n     ]\r\n)\r\ndef get_scatter_plots(n_days=100, top_n=200):\r\n    if n_days == 100 and top_n == 200:\r\n        df = pd.read_csv('data\/ratio_scatter_plot.csv')\r\n        return get_plot(n_days, top_n, df)\r\n    else:\r\n        return get_plot(n_days, top_n)<\/code><\/pre><p>In this example, our inputs are the “value” properties of the components that have the ids “slider-day’” and  “slider-top-breweries”. Our output is the “children” property of the component with the id “ratio-scatter-plot”. When the input values are changed, the decorator function will be called automatically and the output on the scatter is updated. Learn more about callbacks from <a href=\"https:\/\/dash.plotly.com\/basic-callbacks\/\">the examples in the docs.<\/a><br \/>\nIt’s worth noting, that the scatter plot may not be displayed correctly when the page is loaded. To avoid this scenario we need to specify its initial state and produce a scatter plot from the saved CSV file stored in the data folder.  Then, when changing the slider values, all data will be taken directly from the Clickhouse tables.<\/p>\n<div class=\"e2-text-picture\">\n<div class=\"fotorama\" data-width=\"833\" data-ratio=\"1.595785440613\">\n<img src=\"https:\/\/en.leftjoin.ru\/pictures\/scatter_empty_2@2x.png\" width=\"833\" height=\"522\" alt=\"\" \/>\n<img src=\"https:\/\/en.leftjoin.ru\/pictures\/scatter_2@2x.png\" width=\"828\" height=\"515\" alt=\"\" \/>\n<\/div>\n<\/div>\n<p>Add a few more lines responsible for deployment and our app is ready to run:<\/p>\n<pre class=\"e2-text-code\"><code>application = app.server\r\n\r\nif __name__ == '__main__':\r\n    application.run(debug=True, port=8000)<\/code><\/pre><p>Next, we need to  <a href=\"https:\/\/en.leftjoin.ru\/all\/deploying-analytical-web-app-with-aws-elastic-beanstalk\/\">deploy our app to AWS BeansTalk<\/a> and <a href=\"http:\/\/unappd-part-1-en.us-east-2.elasticbeanstalk.com\/\">the first part of our Bootstrap Dashboard is completed<\/a>:<\/p>\n<div class=\"embed-responsive embed-responsive-4by3\" style=\"min-width:500\"><p><iframe id=\"igraph\" scrolling=\"yes\" style=\"border:none;\"seamless=\"seamless\" src='http:\/\/unappd-part-1-en.us-east-2.elasticbeanstalk.com\/' height=\"1360px\" width=\"1100px\"<\/p>\n<\/iframe><\/div><p>Thanks for reading the first part of our series about Bootstrap Dashboards, in the next one we are going to add more new components, improved callbacks, and talk about tables in Bootstrap.<\/p>\n<p><a href=\"https:\/\/github.com\/valiotti\/leftjoin\/tree\/master\/untappd_dashboard_en%20(part_1)\/\"> View the code on Github<\/a><\/p>\n",
            "date_published": "2020-09-16T16:21:28+03:00",
            "date_modified": "2020-10-29T10:04:36+03:00",
            "image": "https:\/\/en.leftjoin.ru\/pictures\/template_1.png",
            "_date_published_rfc2822": "Wed, 16 Sep 2020 16:21:28 +0300",
            "_rss_guid_is_permalink": "false",
            "_rss_guid": "43",
            "_e2_data": {
                "is_favourite": false,
                "links_required": [
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/fotorama\/fotorama.css",
                    "system\/library\/fotorama\/fotorama.js",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css"
                ],
                "og_images": [
                    "https:\/\/en.leftjoin.ru\/pictures\/template_1.png",
                    "https:\/\/en.leftjoin.ru\/pictures\/template_1@2x.png",
                    "https:\/\/en.leftjoin.ru\/pictures\/logo_and_header.png",
                    "https:\/\/en.leftjoin.ru\/pictures\/top-header.png",
                    "https:\/\/en.leftjoin.ru\/pictures\/2@2x.png",
                    "https:\/\/en.leftjoin.ru\/pictures\/scatter_empty_2@2x.png",
                    "https:\/\/en.leftjoin.ru\/pictures\/scatter_2@2x.png"
                ]
            }
        },
        {
            "id": "38",
            "url": "https:\/\/en.leftjoin.ru\/all\/building-a-plotly-dashboard-with-dynamic-sliders-in-python\/",
            "title": "Building a Plotly Dashboard with dynamic sliders in Python",
            "content_html": "<p>Recently we discussed how to use Plotly and built a scatter plot to display the ratio between the number of reviews and the average rating for Russian Breweries registered on Untappd. Each marker on the plot has two properties, the registration period and the beer range. And today we are going to introduce you to Dash, a Python framework for building analytical web applications. First, create a new file name app.py with a get_scatter_plot(n_days, top_n) function from the previous article.<\/p>\n<pre class=\"e2-text-code\"><code>import dash\r\nimport dash_core_components as dcc\r\nimport dash_html_components as html\r\nfrom get_plots import get_scatter_plot<\/code><\/pre><p>After importing  the necessary libraries we need to load CSS styles and initiate our web app:<\/p>\n<pre class=\"e2-text-code\"><code>external_stylesheets = ['https:\/\/codepen.io\/chriddyp\/pen\/bWLwgP.css']\r\napp = dash.Dash(__name__, external_stylesheets=external_stylesheets)<\/code><\/pre><p>Create a dashboard structure:<\/p>\n<pre class=\"e2-text-code\"><code>app.layout = html.Div(children=[\r\n       html.Div([\r\n           dcc.Graph(id='fig1'),\r\n       ]) ,\r\n       html.Div([\r\n           html.H6('Time period (days)'),\r\n           dcc.Slider(\r\n               id='slider-day1',\r\n               min=0,\r\n               max=100,\r\n               step=1,\r\n               value=30,\r\n               marks={i: str(i) for i in range(0, 100, 10)}\r\n           ),\r\n           html.H6('Number of breweries from the top'),\r\n           dcc.Slider(\r\n               id='slider-top1',\r\n               min=0,\r\n               max=500,\r\n               step=50,\r\n               value=500,\r\n               marks={i: str(i) for i in range(0, 500, 50)})\r\n       ])\r\n])<\/code><\/pre><p>Now we have a plot and two sliders, each with its id and parameters: minimum value, maximum value, step, and initial value. Since the sliders data will be displayed in the plot we need to create a callback. Output is the first argument that displays our plot, the following Input parameters accept values on which the plot depends.<\/p>\n<pre class=\"e2-text-code\"><code>@app.callback(\r\n   dash.dependencies.Output('fig1', 'figure'),\r\n   [dash.dependencies.Input('slider-day1', 'value'),\r\n    dash.dependencies.Input('slider-top1', 'value')])\r\ndef output_fig(n_days, top_n):\r\n    get_scatter_plot(n_days, top_n)<\/code><\/pre><p>At the end of our script we will add the following line to run our code :<\/p>\n<pre class=\"e2-text-code\"><code>if __name__ == '__main__':\r\n   app.run_server(debug=True)<\/code><\/pre><p>Now, whenever the script is running our local IP address will be displayed in the terminal. Let’s open it in a web browser to view our interactive dashboard, it’s updated automatically when moving the sliders.<\/p>\n<div class=\"embed-responsive embed-responsive-4by3\" style=\"min-width:800\"><iframe id=\"igraph\" scrolling=\"no\" style=\"border:none;\"seamless=\"seamless\" src=\"http:\/\/dasheng-env.eba-ueep9ck7.us-east-2.elasticbeanstalk.com\" height=\"1100\" width=\"800\" ><\/iframe>\n<\/div>",
            "date_published": "2020-08-03T09:05:31+03:00",
            "date_modified": "2020-08-07T14:43:05+03:00",
            "_date_published_rfc2822": "Mon, 03 Aug 2020 09:05:31 +0300",
            "_rss_guid_is_permalink": "false",
            "_rss_guid": "38",
            "_e2_data": {
                "is_favourite": false,
                "links_required": [
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css"
                ],
                "og_images": []
            }
        },
        {
            "id": "37",
            "url": "https:\/\/en.leftjoin.ru\/all\/building-a-scatter-plot-for-untappd-breweries\/",
            "title": "Building a scatter plot for Untappd Breweries",
            "content_html": "<p>Today we are going to build a scatter plot for Russian Breweries that would display the ratio between the number of reviews and their average ratings for the past 30 days. Data will be taken from check-ins left by Untappd users who rated beers. To make a plot we need markers with specified color and size. The color will depend on a brewery registration date, thus displaying it’s registration period on Untappd, while the size of a marker correlates with the range of beers represented. This article is the first part of our series dedicated to building dashboards with Plotly.<\/p>\n<h2>Writing a Clickhouse query<\/h2>\n<p>First, we need to process the data before using it in our dashboard. Here, we are using public data collected from Untappd. You can find more about this in our previous articles: <a href=\"https:\/\/www.valiotti.com\/leftjoin\/all\/handling-website-buttons-in-selenium\/\" class=\"nu\">“<u>Handling website buttons in Selenium<\/u>”<\/a> and <a href=\"https:\/\/www.valiotti.com\/leftjoin\/all\/example-of-using-dictionaries-in-clickhouse-with-untappd\/\" class=\"nu\">“<u>Example of using dictionaries in Clickhouse with Untappd<\/u>”<\/a>.<\/p>\n<pre class=\"e2-text-code\"><code>from datetime import datetime, timedelta\r\nfrom clickhouse_driver import Client\r\nimport plotly.graph_objects as go\r\nimport pandas as pd\r\nimport numpy as np\r\nclient = Client(host='ec1-2-34-567-89.us-east-2.compute.amazonaws.com', user='default', password='', port='9000', database='default')<\/code><\/pre><p>Our scatter plot will depend on the  <span class=\"inline-code\">get_scatter_plot(n_days, top_n)<\/span> function, which takes two arguments denoting a time span and a number of breweries to display. Let’s write a SQL query to calculate the Brewery Pure Average. It can be presented <a href=\"https:\/\/help.untappd.com\/hc\/en-us\/articles\/360034136372-How-are-ratings-determined-on-Untappd-\">the following<\/a>: multiply the beer rating by the total number of ratings and divide it by the number of brewery reviews. We will also pass a brewery name and its beer range to the query, these parameters can be fetched from our dictionary using the  <span class=\"inline-code\">dictGet<\/span> function. We are only interested in those breweries that have Brewery Pure Average > 0 and the number of reviews > 100.<\/p>\n<pre class=\"e2-text-code\"><code>brewery_pure_average = client.execute(f&quot;&quot;&quot;\r\nSELECT\r\n       t1.brewery_id,\r\n       sum(t1.beer_pure_average_mult_count \/ t2.count_for_that_brewery) AS brewery_pure_average,\r\n       t2.count_for_that_brewery,\r\n       dictGet('breweries', 'brewery_name', toUInt64(t1.brewery_id)),\r\n       dictGet('breweries', 'beer_count', toUInt64(t1.brewery_id)),\r\n       t3.stats_age_on_service \/ 365\r\n   FROM\r\n   (\r\n       SELECT\r\n           beer_id,\r\n           brewery_id,\r\n           sum(rating_score) AS beer_pure_average_mult_count\r\n       FROM beer_reviews\r\n       WHERE created_at &gt;= today()-{n_days}\r\n       GROUP BY\r\n           beer_id,\r\n           brewery_id\r\n   ) AS t1\r\n   ANY LEFT JOIN\r\n   (\r\n       SELECT\r\n           brewery_id,\r\n           count(rating_score) AS count_for_that_brewery\r\n       FROM beer_reviews\r\n       WHERE created_at &gt;= today()-{n_days}\r\n       GROUP BY brewery_id\r\n   ) AS t2 ON t1.brewery_id = t2.brewery_id\r\n   ANY LEFT JOIN\r\n   (\r\n       SELECT\r\n           brewery_id,\r\n           stats_age_on_service\r\n       FROM brewery_info\r\n   ) AS t3 ON t1.brewery_id = t3.brewery_id\r\n   GROUP BY\r\n       t1.brewery_id,\r\n       t2.count_for_that_brewery,\r\n       t3.stats_age_on_service\r\n   HAVING t2.count_for_that_brewery &gt;= 150\r\n   ORDER BY brewery_pure_average\r\n   LIMIT {top_n}\r\n    &quot;&quot;&quot;)\r\n\r\nscatter_plot_df_with_age = pd.DataFrame(brewery_pure_average, columns=['brewery_id', 'brewery_pure_average', 'rating_count', 'brewery_name', 'beer_count'])<\/code><\/pre><h2>Working with a DataFrame<\/h2>\n<p>Add two dotted lines that will pass through the median values of each axis. That way we can find out which breweries are above average, the best ones will be in the upper right area.<\/p>\n<pre class=\"e2-text-code\"><code>dict_list = []\r\ndict_list.append(dict(type=&quot;line&quot;,\r\n                     line=dict(\r\n                         color=&quot;#666666&quot;,\r\n                         dash=&quot;dot&quot;),\r\n                     x0=0,\r\n                     y0=np.median(scatter_plot_df_with_age.brewery_pure_average),\r\n                     x1=7000,\r\n                     y1=np.median(scatter_plot_df_with_age.brewery_pure_average),\r\n                     line_width=1,\r\n                     layer=&quot;below&quot;))\r\ndict_list.append(dict(type=&quot;line&quot;,\r\n                     line=dict(\r\n                         color=&quot;#666666&quot;,\r\n                         dash=&quot;dot&quot;),\r\n                     x0=np.median(scatter_plot_df_with_age.rating_count),\r\n                     y0=0,\r\n                     x1=np.median(scatter_plot_df_with_age.rating_count),\r\n                     y1=5,\r\n                     line_width=1,\r\n                     layer=&quot;below&quot;))<\/code><\/pre><p>Add annotations to display median values by hovering:<\/p>\n<pre class=\"e2-text-code\"><code>annotations_list = []\r\nannotations_list.append(\r\n    dict(\r\n        x=8000,\r\n        y=np.median(scatter_plot_df_with_age.brewery_pure_average) - 0.1,\r\n        xref=&quot;x&quot;,\r\n        yref=&quot;y&quot;,\r\n        text=f&quot;Median value: {round(np.median(scatter_plot_df_with_age.brewery_pure_average), 2)}&quot;,\r\n        showarrow=False,\r\n        font={\r\n            'family':'Roboto, light',\r\n            'color':'#666666',\r\n            'size':12\r\n        }\r\n    )\r\n)\r\nannotations_list.append(\r\n    dict(\r\n        x=np.median(scatter_plot_df_with_age.rating_count) + 180,\r\n        y=0.8,\r\n        xref=&quot;x&quot;,\r\n        yref=&quot;y&quot;,\r\n        text=f&quot;Median value: {round(np.median(scatter_plot_df_with_age.rating_count), 2)}&quot;,\r\n        showarrow=False,\r\n        font={\r\n            'family':'Roboto, light',\r\n            'color':'#666666',\r\n            'size':12\r\n        },\r\n        textangle=-90\r\n    )\r\n)<\/code><\/pre><p>Let’s make our plot more informative by splitting breweries into 4 groups according to the beer range. The first group will include breweries with less than 10 brands, the second group for those holding 10-30 brands, the third one for 30-50 brands, and the last one for large breweries with >50 brands. We stored marker sizes in the <span class=\"inline-code\">bucket_beer_count<\/span> list.<\/p>\n<pre class=\"e2-text-code\"><code>bucket_beer_count = []\r\nfor beer_count in scatter_plot_df_with_age.beer_count:\r\n   if beer_count &lt; 10:\r\n       bucket_beer_count.append(7)\r\n   elif 10 &lt;= beer_count &lt;= 30:\r\n       bucket_beer_count.append(9)\r\n   elif 31 &lt;= beer_count &lt;= 50:\r\n       bucket_beer_count.append(11)\r\n   else:\r\n       bucket_beer_count.append(13)\r\nscatter_plot_df_with_age['bucket_beer_count'] = bucket_beer_count<\/code><\/pre><p>Next step is to perform age-based splitting<\/p>\n<pre class=\"e2-text-code\"><code>bucket_age = []\r\nfor age in scatter_plot_df_with_age.age_on_service:\r\n   if age &lt; 4:\r\n       bucket_age.append(0)\r\n   elif 4 &lt;= age &lt;= 6:\r\n       bucket_age.append(1)\r\n   elif 6 &lt; age &lt; 8:\r\n       bucket_age.append(2)\r\n   else:\r\n       bucket_age.append(3)\r\nscatter_plot_df_with_age['bucket_age'] = bucket_age<\/code><\/pre><p>Let’s divide our DataFrame into 4 parts to build separate scatter plots with its own color and size.<\/p>\n<pre class=\"e2-text-code\"><code>scatter_plot_df_0 = scatter_plot_df[scatter_plot_df.bucket == 0]\r\nscatter_plot_df_1 = scatter_plot_df[scatter_plot_df.bucket == 1]\r\nscatter_plot_df_2 = scatter_plot_df[scatter_plot_df.bucket == 2]\r\nscatter_plot_df_3 = scatter_plot_df[scatter_plot_df.bucket == 3]<\/code><\/pre><h2>Plotting<\/h2>\n<p>Now we are ready to build the plot, add our 4 brewery groups one by one, setting its key parameters: name, marker color, annotation transparency and text.<\/p>\n<pre class=\"e2-text-code\"><code>fig = go.Figure()\r\nfig.add_trace(go.Scatter(\r\n    x=scatter_plot_df_0.rating_count,\r\n    y=scatter_plot_df_0.brewery_pure_average,\r\n    name='&lt; 4',\r\n    mode='markers',\r\n    opacity=0.85,\r\n    text=scatter_plot_df_0.name_count,\r\n    marker_color='rgb(114, 183, 178)',\r\n    marker_size=scatter_plot_df_0.bucket_beer_count,\r\n    textfont={&quot;family&quot;:&quot;Roboto, light&quot;,\r\n              &quot;color&quot;:&quot;black&quot;\r\n             }\r\n))\r\n\r\nfig.add_trace(go.Scatter(\r\n    x=scatter_plot_df_1.rating_count,\r\n    y=scatter_plot_df_1.brewery_pure_average,\r\n    name='4 – 6',\r\n    mode='markers',\r\n    opacity=0.85,\r\n    marker_color='rgb(76, 120, 168)',\r\n    text=scatter_plot_df_1.name_count,\r\n    marker_size=scatter_plot_df_1.bucket_beer_count,\r\n    textfont={&quot;family&quot;:&quot;Roboto, light&quot;,\r\n              &quot;color&quot;:&quot;black&quot;\r\n             }\r\n))\r\n\r\nfig.add_trace(go.Scatter(\r\n    x=scatter_plot_df_2.rating_count,\r\n    y=scatter_plot_df_2.brewery_pure_average,\r\n    name='6 – 8',\r\n    mode='markers',\r\n    opacity=0.85,\r\n    marker_color='rgb(245, 133, 23)',\r\n    text=scatter_plot_df_2.name_count,\r\n    marker_size=scatter_plot_df_2.bucket_beer_count,\r\n    textfont={&quot;family&quot;:&quot;Roboto, light&quot;,\r\n              &quot;color&quot;:&quot;black&quot;\r\n             }\r\n))\r\n\r\nfig.add_trace(go.Scatter(\r\n    x=scatter_plot_df_3.rating_count,\r\n    y=scatter_plot_df_3.brewery_pure_average,\r\n    name='8+',\r\n    mode='markers',\r\n    opacity=0.85,\r\n    marker_color='rgb(228, 87, 86)',\r\n    text=scatter_plot_df_3.name_count,\r\n    marker_size=scatter_plot_df_3.bucket_beer_count,\r\n    textfont={&quot;family&quot;:&quot;Roboto, light&quot;,\r\n              &quot;color&quot;:&quot;black&quot;\r\n             }\r\n))\r\n\r\nfig.update_layout(\r\n    title=f&quot;The ratio between the number of reviews and the average brewery rating for the past &lt;br&gt; {n_days} days, top {top_n} breweries&quot;,\r\n    font={\r\n            'family':'Roboto, light',\r\n            'color':'black',\r\n            'size':14\r\n        },\r\n    plot_bgcolor='rgba(0,0,0,0)',\r\n    yaxis_title=&quot;Average rating&quot;,\r\n    xaxis_title=&quot;Number of reviews&quot;,\r\n    legend_title_text='Registration period&lt;br&gt; on Untappd in years:',\r\n    height=750,\r\n    shapes=dict_list,\r\n    annotations=annotations_list\r\n)<\/code><\/pre><p>Voila, the scatter plot is done! Each point is a separate brewery. The color shows the brewery beer range and when hovering we will see a summary including the average rating for the past 30 days, number of reviews, brewery name, and beer range. The dotted lines are passing through the median values we calculated with NumPy, they’re showing us the best breweries in the upper right. In our next article, we are going to create a breweries dashboard with dynamic parameters.<\/p>\n<div class=\"embed-responsive embed-responsive-4by3\" style=\"min-width:500\"><iframe id=\"igraph\" scrolling=\"no\" style=\"border:none;\"seamless=\"seamless\" src=\"\/\/plotly.com\/~i-bond\/20.embed?showlink=false\" height=\"800\" width=\"900\"><\/iframe>\n<\/div>",
            "date_published": "2020-07-15T15:32:22+03:00",
            "date_modified": "2020-07-15T15:37:39+03:00",
            "_date_published_rfc2822": "Wed, 15 Jul 2020 15:32:22 +0300",
            "_rss_guid_is_permalink": "false",
            "_rss_guid": "37",
            "_e2_data": {
                "is_favourite": false,
                "links_required": [
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css",
                    "system\/library\/highlight\/highlight.js",
                    "system\/library\/highlight\/highlight.css"
                ],
                "og_images": []
            }
        }
    ],
    "_e2_version": 3386,
    "_e2_ua_string": "E2 (v3386; Aegea)"
}