Graph of telegram channels related to analytics

Время чтения текста – 3 минуты

The authors of various Telegram-blogs often publish a selection of their favorite channels so as to share their professional choice with their audience. The idea, of course, is not new, but I decided not just to make a rating of interesting analytical telegram blogs, but to solve this problem analytically.

As part of the current course of my studies, I am learning many modern approaches to data analysis and visualization. At the very beginning of the course, there was a warm-up exercise: object-oriented programming in Python for collecting and iterative building a graph with TMDB API. Usually, this method is used to construct a connection graph of actors, where the connection is a role in the same film. However, I decided that I could apply it to another problem: building a graph of connections for the analytic community.

Since my time has been particularly limited recently, and I have already completed a similar task for the course, I decided to transfer this knowledge to someone else who is interested in analytics. Fortunately, at that moment, a candidate for the vacancy of a junior data analyst, Andrey, texted me in direct messages. He is now in the process of comprehending all the intricacies of analytics, so we agreed on an internship, in which Andrey parsed data from telegram channels.

Andrey’s main task was to collect all texts from the Internet analyst’s telegram channel, select the channels, which Aleksey Nikushin linked, collect texts from these telegram channels and links on these channels. “Link” means any mention of the channel: through @, through a link or repost. As a result of parsing, Andrey got two files: nodes and edges.
Now I will present to you the graph that I got based on this data and comment on the results.

I would like to take this opportunity to express my compliments to the karpov.courses team, as Andrey has excellent knowledge of the Python language!

As a result, the top 10 channels in terms of degree (number of connections) looks like this:

  1. Интернет-аналитика
  2. Reveal The Data
  3. Инжиниринг Данных
  4. Data Events
  5. Datalytics
  6. Чартомойка
  7. LEFT JOIN
  8. Epic Growth
  9. RTD: ссылки и репосты
  10. Дашбордец

In my opinion, it turned out extremely exciting and visually interesting, and Andrey is a great fellow! By the way, he also started his own channel ”Это разве аналитика?”, where analytics news is published.

Looking ahead, this problem has a continuation. With the help of the Markov chain, we modeled where the user ends up if he iteratively navigates through all the mentions in the channels. It turned out unexpectedly, but we will tell about it next time!

Share
Send
 388   2021   data analytics   python
Popular