2 posts tagged


Collecting Data on Ad Campaigns from VK.com

Время чтения текста – 11 минут

We have a lot to share in today’s longread: we’ll retrieve data on ad campaigns from Vkontakte (widely popular social network in Russia and CIS countries) and compare them to Google Analytics data in Redash. This time we don’t need to create a server, as our data will be transferred to Google Docs via Google Sheets API.

Getting an Access Token
We need to create an app to receive our access token. Follow this link https://vk.com/apps?act=manage and click “Create app” on the developer’s page. Choose a name for your app and check it as a “Standalone app”. Then, click Settings in the left menu and save your app ID.

More details on access tokens can be found here: Getting an access token

Copy this link:


And change YourClientID to your app ID, this will allow you to get information about your advertising account. Open this link in your browser and you will be redirected to another page, which URL address holds your generated access token.

Access token expires in 86400 seconds or 24 hours. If you want to generate a token with an unlimited lifetime period, just pass scope to the offline parameter. In case if you need to generate a new token – change your password account or terminate all active sessions in security settings.

You will also need your advertising account ID to make API requests. It can be found via this link, just copy it:  https://vk.com/ads?act=settings

Using APIs to collect data
Let’s write a script that would allow us to retrieve information on all user’s ad campaigns: number of impressions, сlicks and costs. The script will pass this data to a DataFrame and send it to Google Docs.

from oauth2client.service_account import ServiceAccountCredentials
from pandas import DataFrame
import requests
import gspread
import time

We have several constant variables: access token, advertising account ID and Vkontakte API Version. Here we are using the most recent API version, which is 5.103.

token = 'fa258683fd418fafcab1fb1d41da4ec6cc62f60e152a63140c130a730829b1e0bc'
version = 5.103
id_rk = 123456789

To get advertising stats you need to use the  ads.getStatistics method and pass your ad campaign ID to it. Since we don’t run any advertisements yet, we’ll use the  ads.getAds method that returns IDs of ads and campaigns.

Learn more about the API methods available for Vkontakte here

Use the requests library to send a request and convert the response to JSON.

campaign_ids = []
ads_ids = []
r = requests.get('https://api.vk.com/method/ads.getAds', params={
    'access_token': token,
    'v': version,
    'account_id': id_rk
data = r.json()['response']

We have a familiar list of dictionaries returned, similar to the one we have reviewed in the previous article, “Analysing data on Facebook Ad Campaigns with Redash”.

Fill in the ad_campaign_dict dictionary as follows: specify ad ID as a key, and campaign ID as a value, where this ad belongs to.

ad_campaign_dict = {}
for i in range(len(data)):
    ad_campaign_dict[data[i]['id']] = data[i]['campaign_id']

Having ID for every ad needed we can invoke the  ads.getStatistics method to collect data on the number of impressions, clicks, costs, and dates for a particular ad, so create several empty lists in advance.

ads_campaign_list = []
ads_id_list = []
ads_impressions_list = []
ads_clicks_list = []
ads_spent_list = []
ads_day_start_list = []
ads_day_end_list = []

We need to invoke the getStatistics method for each ad separately, let’s refer to the ad_campaign_dict and iterate our requests. Retrieve all-time data by calling the ‘period’ method with the  ‘overall’ value. Some ads may not have impression or clicks if they haven’t been launched yet, this may cause a  KeyError. Let’s recall to the try — except approach to handle this error.

for ad_id in ad_campaign_dict:
        r = requests.get('https://api.vk.com/method/ads.getStatistics', params={
            'access_token': token,
            'v': version,
            'account_id': id_rk,
            'ids_type': 'ad',
            'ids': ad_id,
            'period': 'overall',
            'date_from': '0',
            'date_to': '0'
            data_stats = r.json()['response']
            for i in range(len(data_stats)):
                for j in range(len(data_stats[i]['stats'])):
        except KeyError:

Now, create a DataFrame and print out the first 5 data points

df = DataFrame()
df['campaign_id'] = ads_campaign_list
df['ad_id'] = ads_id_list
df['impressions'] = ads_impressions_list
df['clicks'] = ads_clicks_list
df['spent'] = ads_spent_list
df['day_start'] = ads_day_start_list
df['day_end'] = ads_day_end_list

Exporting Data to Google Docs
We’ll need a Google API access token, navigate to https://console.developers.google.com and create one. Choose any name you like, then go to your Dashboard and click “Enable APIs and Services”. Choose Google Drive API from the list, enable it and do exactly the same for Google Sheets API.

After activation you will be redirected to the API control panel. Click Credentials – Create Credentials, click choose data type and create an account. Choosing a role is optional, just proceed and specify JSON as a key type.

After these steps you can download a JSON file with your credentials, we’ll rename it to «credentials.json». On the main page you’ll find the email field – copy your email address.

Go to https://docs.google.com/spreadsheets and create a new file named data, we’ll pass data from our DataFrame to it. Put the  credentials.json file in one directory with the script and continue coding. Add these links to the scope list:

scope = ['https://spreadsheets.google.com/feeds', 'https://www.googleapis.com/auth/drive']

We will use the  ServiceAccountCredentials.from_json_keyfile_name and  gspread.authorize methods available in the  oauth2client and  gspread libraries for authenticaion process. Specify your file name and the scope variable in the ServiceAccountCredentials.from_json_keyfile_name method. The  sheet variable will allow us to send requests to our file in Google Docs.

creds = ServiceAccountCredentials.from_json_keyfile_name('credentials.json', scope)
client = gspread.authorize(creds)
sheet = client.open('data').sheet1

Apply the update_cell method to enter new value in a table cell. It’s worth mentioning that the indexing starts at 0, not 1. With the first loop we’ll move the column names of our DataFrame. And with the following loops we’ll move the rest of our data points. The default limits allow us to make 100 loops for 100 seconds. These restrictions may cause errors and stop our script, that’s why we need to use time.sleep and make the script sleep for 1 second after each loop.

count_of_rows = len(df)
count_of_columns = len(df.columns)
for i in range(count_of_columns):
    sheet.update_cell(1, i + 1, list(df.columns)[i])
for i in range(1, count_of_rows + 1):
    for j in range(count_of_columns):
        sheet.update_cell(i + 1, j + 1, str(df.iloc[i, j]))

In case of success, you’ll get the same table:

Exporting data to Redash

See how you can connect Google Analytics to Redash in this article «How to connect Google Analytics to Redash?».

Having a table with Google Analytics and ad campaigns from Vkontakte exported we can compare them by writing the following query:

    CASE WHEN ga_source LIKE '%vk%' THEN 'vk.com' END AS source,
    SUM(query_49.ga_sessions) AS sessions,
    SUM(query_49.ga_newUsers) AS users
FROM query_49
JOIN query_50
ON query_49.ga_date = query_50.day_start
WHERE query_49.ga_source LIKE '%vk%' AND DATE(query_49.ga_date) BETWEEN '2020-05-16' AND '2020-05-20'
GROUP BY query_49.ga_date, source

ga_source — the traffic source, from which a user was redirected. Use the  CASE method to combine everything that contains “vk” in one column called «vk.com». With the help of JOIN operator we can add the table with the data on ad campaigns, merging by date. Let’s take the day of the last ad campaign and a couple of days after, this will result in the following output:

Now we have a table that reflects how much were spent in ad costs on a certain day, the number of users who viewed this ad, were engaged and redirected to our website, and then completed the sign-up process.

 No comments    613   2020   Analytics engineering   BI-tools   data analytics   longread

Metrics for marketing analytics

Время чтения текста – 19 минут


Today in this update we have a long-read, supported by a telegram-channel Russian marketing on the subject of analytical metrics in marketing. Within the article we’ll discuss what marketing analytics is needed for, which metrics one should operate at calculation of marketing efficiency, how one can structure the work on building marketing reporting. Moreover, we will touch upon high-level KPI, discuss quite popular framework and sort out how to compute important analytical indicators. The article came up rather voluminous, and various abbreviations are used hereby, therefore we couldn’t do without a glossary.

  • Glossary
    • Revenue / Income / Sales — revenue, income (rub. / $ / euro)
    • GMV — Margin (% / rub. )
    • MAU — Monthly active users (persons)
    • WAU — Weekly active users (persons)
    • DAU — Daily active users (persons)
    • Requests — Requests (of advertising) (units)
    • Impressions — Displays (of advertising) (units)
    • Clicks — Clicks (on advertising) (units)
    • FR — Fill rate ( =Impressions / Requests) (%)
    • CTR — click through rate ( =Clicks / Impressions) (%)
    • С1 — conversion first purchase (%)
    • R, R1, R3, R7 — retention (of the 1st, the 3rd, the 7th day) (%)
    • RR (rolling retention) (%)
    • Churn — Outflow (%)
    • ARPU — average revenue per user (rub. / $ / euro)
    • ARPPU — average revenue per paying user (rub. / $ / euro)
    • cARPU — cumulative average revenue per users (rub. / $ / euro)
    • LTV (lifetime value) / CLV (customer lifetime value) — Customer lifetime value
    • ROI — return of investment (%)
    • ROAS — return on advertising spend (%)
    • ROMI — return of marketing investment (%)
    • CPA — cost per action (for example, purchase or installation of an app) (rub. / $ / euro)
    • CPC — cost per click (rub. / $ / euro)
    • CPO — cost per order (rub. / $ / euro)
    • CPS — cost per sale (rub. / $ / euro)
    • CPM (cost per mille) — cost per thousand advertising displays (rub. / $ / euro)
    • CAC — customer acquisition cost (rub. / $ / euro)
    • CARC — customer acquisition and retention cost (rub. / $ / euro)
    • SAC — share of advertising costs (%)

What marketing analytics is needed for?

In order to figure analytical metrics out, for starters we need to find out why we need marketing analytics and which questions it can reply. In general, marketing analytics is research and measurement of marketing activity in quantitative indicators. At that, most often, the aim of these actions is to evaluate the efficiency of marketing, calculate the return on marketing investments in the company.
Marketing analytics helps to find answers to the following questions:

  • How efficiently the marketing budget is allocated?
  • Which ROMI do various marketing channels provide?
  • Which target audience is the most effectively converted?
  • Which communication channels are the most / the least profitable?
  • What is the biggest source of company’s profit?

Marketing analysis is better to be initiated from identification of key business indicators and links therebetween, we’ll talk about it a bit later. In general, work on building of marketing analytics is more like creation of the system of proper metrics, their planning, measurement and reaction to changes of these metrics. More thoroughly the cycle PDCA is described in a book of W.Deming "Way out of crisis", I recommend you to get familiar with it.

Key principles of building proper analytics in marketing

Applying a systematic approach to analysis of data affecting the marketing activity can help marketologists to solve a problem, eliminate pain, provide recommendations for further marketing strategical steps. System approach implies abiding by a number of key principles, without which the analytics will turn out to be incomplete.

The task of marketing data analysis should be performed by a professional, who gets along with basics of mathematical statistics, econometrics and, obviously, can calculate, interpret the results and draw conclusions, relevant for a specific business (i.e. understanding the industry). Only in this case, the analytics can be fruitful, otherwise incorrect conclusions based on the data can even exacerbate the situation, that will lead not to budget optimization, but to devastation thereof.

Solving a task, one should address the data affecting a problem from various points of view. Different indicators, different aggregation of the data will allow to look at the problem objectively. It is desired, that the very same conclusion based on the data is repeated at least two times.

Studying current problems one shouldn’t operate the outdated retrospective data, the world is changing extremely fast, same way as the situation on the market / in a company. The analysis, held one year ago, today might have absolutely different results, therefore one needs to update the reports and data therein on a regular basis.

Results of analysis should be clear for a person from business, not familiar to technical terminology. In a perfect scenario – each report helps to wisely sort a problem out and pushes a reader to obvious conclusions. The situation, when an analyst is forced to dig into a huge pile of graphs, unclear charts and pages with numbers without any conclusions is inacceptable.

Such principles will surely help to hire competent analysts for building proper accountability.

How to structure indicators?

The system of metrics, helping to evaluate the marketing effectiveness, can be built based on the several considerations. One of the key approaches to structuration is customer’s lifecycle. Let’s try to sort it out and speak about one of the interesting frameworks for working on such system of metrics. The following major stages can be allocated in the life cycle of a customer.
1) Audience attraction – work of a marketing specialists starts even prior to the moment when potential audience turns into clients of a company
2) Engagement – stage of conversion of users who came to a website / mobile app into registered customers
3) Monetization – stage of formation of paying users (from the number of registered ones)
4) Retention / Churn – events dedicated to development and retention of the attracted audience, decreasing of the churn level

Method AARRR / Pirate Metrics

In 2007 Dave McClure developed and proposed a method called AARRR – the system of metrics, allowing start-ups to get to the bottom of business indicators. Another name of the method that can be also met, — "pirate metrics" due to the fact that the name is pronounced in a pirate way: "aarrr!".
So, let’s sort the approach out and speak about the metrics, responding to each step of the "funnel". The abbreviation consists of 5 key marketing stages:

  • Аcquisition — corresponds to paragraph 1 above
  • Аctivation — corresponds to paragraph 2 above
  • Retention — corresponds to paragraph 4 above
  • Revenue — corresponds to paragraph 3 above
  • Referral — recommendations (recently introduced stage)

At the entrance to the funnel our target audience that we desire to acquire is placed. Then, we are trying to do everything possible to register a potential buyer and turn him into a registered client (by that moment, a person who visited our site / app should realize the value of our product). Thereafter, a client conducts purchases and returns to us again and again. In the end, if he really likes our product, he will recommend it to his friends / acquaintances.

AARRR-funnel, pirate metrics (image source)

On every level of the funnel you need to choose metrics, describing transition from one state to another, that we can calculate and analyze. Let’s have an insight into each of the steps and metrics that correspond thereto. We’ll be studying each based on the example of real organizations, in order for calculation of indicators to be maximally comprehensible in practice.


Reach of potential customers is a key stage of formation of the new audience. let’s examine this crucial stage based on the example of a some mobile app and channels of traffic attraction. Oftentimes, the audience comes to application from several different sources:

  1. Organic traffic: search Google, Yandex, Bing, etc
  2. Organic mobile traffic: search in Apple Store / Google Play
  3. Commercial traffic: advertising in Facebook / Instagram, context advertising (Adwords), mobile advertising networks.

We’ll discuss it, using the example of an ad in Facebook. Each advertising announcement is targeted on the potential advertising audience, which is called "Reach" within the Facebook terminology. At that, we can optimize impressions of an ad by clicks / conversion / etc. Our task is to get as effective audience as possible for minimal amount of money. Consequently, we need to select metrics that will help us to evaluate efficiency. Let’s study them:

  • Impressions — number of advertisement displays, the indicator itself won’t say a lot and is extremely linked to the volume of potential audience, however we will need it for understanding of other metrics.
  • Clicks — number of clicks on ad in absolute figures also depends on the number of impressions.
  • Installs — number of clients who installed the mobile app
  • CTR — click through rate, calculated as Clicks / Impressions ratio and shows how efficient our ad is from the point of audience’s interest. in other words, what is click through rate of our ad.
  • CR (conversion rate) (= Installs / Clicks) — level of conversion, shows which percent of users have installed the app from those, who clicked on the advertising announcement
  • Spend — amount of money that we’ve spent on this advertisement
  • CPC (= Spend / Clicks) — shows us the cost of one click. We should operate this indicator in comparison with other ads / market benchmarks
  • CPM (= Spend / Impressions * 1000 ) — shows us the cost of thousands of impressions of an add. It is used for comparison with other ads / benchmarks
  • CPI (= Spend / Installs) — unit value of one install
  • Revenue — final revenue that we receive from this advertising announcement / campaign (you need to have tools for the proper attribution)
  • ROAS (= Revenue / Spend) — return on investments to advertising, gross income from a dollar spent. Metric shows the efficiency of advertising campaign from the point of money invested into it. For example, if ROAS is equal to 300%, it means that on every 1$ spent, 3$ were earned, and if ROAS is equal to 30%, it means that on every 1$ spent 30 cents were earned.

Thus, we already have quite good metrics palette that we can work with – examine their dynamics, compare the advertisements between each other and between various traffic sources. For example, a simple table, containing these indicators will already be the first approaching to understanding the advertising efficiency.

Facebook Campaign Efficiency

Advertisement Spend ($) Installs CPI Impressions CPM Clicks CTR CPC ROAS
Creative-1 x x x x x x x x x
Creative-2 x x x x x x x x x

This table can be reconstructed in a way, that we have dates vertically, and campaign is selected from the filter, thus we will start understanding the changes in dynamics of key indicators of traffic acquisition.

Summary: We can measure CTR of various banners and understand which from them is more appealing for the audience. this indicator can be used at A/B testing of the very same banner with selection of the most effective one. At calculation of effectiveness, you need to keep in mind CPC apart from CTR, in order to select not only the most clickable banner, but also not the most expensive one.

Key KPIs, indicators of efficiency in terms of money – CPI / ROAS, the first one shows how cheap / expensive we procure the traffic, and the second one – how well the traffic procured is monetized.


Let’s assume that we are developing a mobile game. Let’s think of what can be activation of user in this case? We have attracted users, who have installed a game to their smartphones. Our next task is to register a user (to make him a player), propose him an introduction tour to pass.
On this stage two metrics can be considered the key ones conversion into a registered user (= Registrations / Installs), conversion into users who passed a tutorial (=Tutorial Users / Installs).

Consequently, these two metrics will show us whether we require way too much from a user on the stage of registration, or, vice versa, the registration is very simple for him. The second metric will show how comprehensible the introduction to the game is, whether users are interested in passing the introduction tour, whether we require enough actions from a user.

Moreover, the last metric can be decomposed, if within the tutorial process a user needs to conduct several actions, we can examine the funnel of conversions into each of actions and understand the problematic spots of activation of new users. After activating our audience, we need to keep it, so we can make money thereafter.


Any organization would prefer to have an active base of loyal clients, who make repeating orders on a regular basis. Due to this fact, it is extremely important to track down a few key metrics: retention rate (or Rolling retention), Churn. I was sorting out construction of retention and rolling retention reports more thoroughly in one of the last blog’s articles.

Sticky Factor can be considered another crucial and fundamental metric — that is the level of engagement of users. Sticky Factor for a week is calculated rather easily: DAU / WAU * 100%. Let’s sort it out in more details based on the last example. The same way as before, we have a table — client_session, in which for every user_id the timestamps of activity are stored created_at. Thereafter, calculation of Sticky is quite easily performed with the following SQL-request:

SELECT daily.dau/weekly.wau*100 AS sticky
-- Calculating the average DAU for a week
( SELECT avg(dau) AS dau
(SELECT from_unixtime(cs.created_at, "yyyy-MM-dd") AS event_date,
ndv(cs.user_id) AS dau
FROM client_session cs
AND from_unixtime(cs.created_at)>=date_add(now(), -7)
AND from_unixtime(cs.created_at)<=now()
GROUP BY 1) d) daily,
-- Calculating WAU for a week
( SELECT ndv(cs.user_id) AS wau
FROM client_session cs
AND from_unixtime(cs.created_at)>=date_add(now(), -7)
AND from_unixtime(cs.created_at)<=now() ) weekly

Along with the fundamental metrics we shall address also the metrics, connected to the tools of client base retention. These can be represented by the tools of direct marketing: sms, email, push-notifications. Each tool usually has the following descriptive metrics: number of messages sent / number of messages delivered / number of users returned. They showcase the efficiency of each of the tools.


Finally, we have reached the very key metric, that represents a point of interest of all the business users – money. Revenue, profit – money that we receive from users at acquisition of our product. In absolute figures, this metric (or the result of company’s activity) is not highly indicative, however it’s important for understanding of the future trends.

Most often, the following number of relative metrics are used, that describe the behaviour of users:
ARPU ( = Revenue / Users )— average revenue per one user
cARPU( = cumulative Revenue / Users ) — cumulative average revenue per one user
ARPPU ( = Revenue / Payers ) — average revenue per paying user
Avg Receipt (= Revenue / Purchases ) — average check
LTV / CLV — aggregate revenue per one user (life value of a customer)

I am planning to dedicate a separate post to an issue of LTV, since it’s quite a wide subject. In this post we’ll figure out ARPU, cumulative ARPU and connection with LTV. The metric ARPU shows us how much we earn on a user on average for some period of time (normally, it is a week or a month). It is useful information, however it might be not enough. The task of efficient marketing is to attract such users, that bring company more money than the amount that is spent on attraction thereof. Thus, if we are modifying the indicator ARPU and review the cumulative ARPU for 30, 60, 90, 180 days, for instance, we will receive quite a good approximation to an LTV of a user. Would be even better if we build a curve of cumulative ARPU by days.

Curve of cumulative ARPU

By adding CPI as a horizontal line, we will obtain a graph that is extremely useful for understanding. In the point of two lines’ intersection we get a day, starting from which the revenue from a user becomes higher than the costs on his acquisition (user’s acquisition becomes effective). In the example that we observed above that is the 56th day of client’s life. Solution of this task is like searching for a break-even point, however we need to keep in mind that company bears also other indirect costs, that need to be considered in order to calculate the break-even point correctly.


Recommendation of products of the company to friends, relatives and acquaintances can be considered as the best scenario of interaction with clients and the highest level of reward for the company. With regard to metrics, the following can be allocated: number of activated invited new users per one client and NPS.

Number of activated referrals allows to increase CAC / CPI. For example, we attract a user for $1 and want to preserve such a tendency. We have developed mechanics of referral links and revealed that now, after implementation, an average user invites two other users. So, in this case, the cost of acquisition of a user will be equal to $1 / 3 = $0.33. Consequently, we can afford acquiring users for $3, maintaining the value of CAC that is acceptable for us.

NPS (Net Promote Score) — metric, that showcases the level of customer loyalty. The mechanics of calculation thereof is thoroughly described in Wikipedia, therefore we won’t stop on this point. Let’s just say that it is recommended to measure NPS on a regular basis, using direct marketing communication channels.

Hierarchy of metrics within an organization

We have examined crucial metrics of every stage of AARRR quite thoroughly, and now all we have left is to find out how we can structure the indicators in order to receive a perfect dashboard.

For the solution of this task, it would be wise to decompose the goals of the company and the metrics corresponding thereto into different levels. Oftentimes, each subsequent level corresponds to a company’s department and represents KPI of this department. To put it simply, we can imagine the main high-level goal of the company – Profit and decompose it into components: Revenue and Costs.

Hierarchy of metrics within an organization

A good example is a school of English – SkyEng, on the video you can get familiar to the thoroughly designed structure of metrics of SkyEng.

Another alternative might be construction of dashboard structure on the basis of AARRR framework, sorted out above. Schematically, such dashboard will look as follows:


Today we have studied the key marketing metrics, that will help to track the changes on every stage of marketing funnel and tell you about the efficiency of each stage, as well as become useful tool of any marketologist’s activity.

References to the topic:

 No comments    495   2019   analytics   Analytics engineering   longread   marketing