Real-time Tweets geolocation visualization with Elasticsearch and Kibana region map

In Kibana version 5.5 a new type of chart has been added: Region Map.
Region maps are thematic maps in which boundary vector shapes are colored using a gradient: higher intensity colors indicate larger values, and lower intensity colors indicate smaller values. These are also known as choropleth maps.

In this post we are going to see how to use the Region Map to visualize the geolocation detail of a stream of Tweets (consumed using the Twitter streaming API). Basically we will show the location (by country) of a stream of Tweets on the map (higher intensity colors indicate larger volume of Tweets).

Here you can read more about the Region Map:

I am using Elasticsearch and Kibana version 5.5 on Ubuntu 14.04 and Python 3.4.

We are going to use the Twitter streaming API to consume the public data stream flowing through Twitter (filtered by a set of hashtags/keywords). Given the latitude and longitude of each tweet (in GeoJSON format, when available), we are going to use the Google Maps Geocoding API to get the country name (or code).
Once we have identified the country, we are going to index the Tweet into Elasticsearch and then visualize its location using the Kibana Region Map.
For each Tweet we are interested in the country (which represents the geographic location of the Tweet as reported by the user or client application), the text (for further queries) and the creation date (to filter our results).

First of all, define a new Elasticsearch mapping called tweet, within the index tweetrepository:

Notice that the country field is a keyword field type (a field to index structured content such as email addresses, hostnames, status codes, zip codes or tags). It will be used as the join field (between the map and the terms aggregation) for the Region Map visualization.
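
As a rough sketch, a mapping along these lines could be built in Python as shown here. Only the index name (tweetrepository), the type name (tweet) and the keyword type of the country field come from the text above; the field names text and created_at and their types are assumptions made for illustration.

```python
import json

INDEX_NAME = "tweetrepository"
DOC_TYPE = "tweet"


def build_index_body():
    """Return an index-creation body holding the mapping for the tweet type."""
    return {
        "mappings": {
            DOC_TYPE: {
                "properties": {
                    # keyword: indexed as-is, so it can be used as the join field
                    "country": {"type": "keyword"},
                    # assumed field names for the Tweet text and creation date
                    "text": {"type": "text"},
                    "created_at": {"type": "date"},
                }
            }
        }
    }


if __name__ == "__main__":
    # The printed JSON can be used as the body of the index-creation request
    # for INDEX_NAME (for example through the Python Elasticsearch client).
    print(json.dumps(build_index_body(), indent=2))
```

The nested type name under "mappings" matches Elasticsearch 5.x, the version used in this post.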

Using Python tweepy we are going to read the public stream of Tweets.

For each tweet we are going to use the Google API to identify the country from the GeoJSON details. Once we have identified the country, we index the document into Elasticsearch.
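
The country lookup can be sketched roughly as follows. The helper names are hypothetical and the HTTP call itself is left out, so the functions only build the reverse-geocoding URL and read the country out of an already decoded JSON response. The sketch assumes the response lists address components, each with a types list, and that an API key is available.

```python
from urllib.parse import urlencode

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def tweet_lat_lng(tweet):
    """Return (lat, lng) from the tweet's GeoJSON point, or None if missing."""
    point = tweet.get("coordinates")
    if not point or not point.get("coordinates"):
        return None
    # GeoJSON stores positions as [longitude, latitude]
    lng, lat = point["coordinates"][:2]
    return lat, lng


def reverse_geocode_url(lat, lng, api_key):
    """Build the reverse-geocoding request URL for a latitude/longitude pair."""
    query = urlencode({"latlng": "{},{}".format(lat, lng), "key": api_key})
    return GEOCODE_URL + "?" + query


def country_from_geocode(response):
    """Return the country name from a decoded geocoding response, or None."""
    for result in response.get("results", []):
        for component in result.get("address_components", []):
            if "country" in component.get("types", []):
                return component.get("long_name")
    return None
```

Tweets without coordinates yield None from tweet_lat_lng and can simply be skipped before any request is made.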

This is what an indexed document looks like.

tweet_region_map_document
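
For illustration, a document with the three fields of interest could be assembled like this. It is only a sketch: the field names text and created_at are assumptions, build_document is a hypothetical helper, and the date format is assumed to be the one used by the tweet's created_at attribute.

```python
from datetime import datetime

# Assumed format of a tweet's created_at value,
# e.g. "Wed Aug 23 10:15:30 +0000 2017"
TWITTER_DATE_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def build_document(tweet, country):
    """Keep only the country, the text and the creation date of a tweet."""
    created = datetime.strptime(tweet["created_at"], TWITTER_DATE_FORMAT)
    return {
        "country": country,                  # join field for the Region Map
        "text": tweet["text"],               # kept for further queries
        "created_at": created.isoformat(),   # ISO 8601 string for the date field
    }
```

The resulting dictionary is what would be sent to the tweetrepository index as a document of type tweet.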

We are now going to create a new region map visualization.

new_region_map

In the options section of the visualization, select the Vector Map. This is the map layer that will be used. The list includes the maps hosted by the Elastic Maps Service as well as your self-hosted layers configured in the config/kibana.yml file. To learn more about how to configure Kibana to make self-hosted layers available, see the region map settings documentation.

We will use the World Country vector map. The join field is the property from the selected vector map that will be used to join on the terms in your terms-aggregation. In this example the join field is the country name (so we can match the regions of the map with our documents).

In the style section you can choose the color scheme (red to green, shades of blue/green, heatmap) that will be used.
region_map_configuration

In the buckets section, select the country field (a field of our mapping). The values of this field will be used as the lookup (join) key on the vector map.

region_map_configuration1
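
Conceptually, the visualization counts documents per country with a terms aggregation and then matches each bucket key against the join field of the vector map. The sketch below imitates that idea in plain Python. It is not the request Kibana actually sends; the aggregation name countries, the default size and both helper names are assumptions.

```python
def country_terms_aggregation(size=200):
    """Build a search body that counts documents per value of the country field."""
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "aggs": {"countries": {"terms": {"field": "country", "size": size}}},
    }


def join_buckets_to_regions(buckets, region_names):
    """Match aggregation buckets to map regions by exact country name."""
    counts = {bucket["key"]: bucket["doc_count"] for bucket in buckets}
    # Regions without a matching bucket are left out, and so are buckets
    # whose key does not match any region name exactly.
    return {name: counts[name] for name in region_names if name in counts}
```

Because the match is exact, a country value that is spelled differently from the name used in the vector map will not be colored.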

This is what our region map looks like. The darker countries are the ones with a higher number of Tweets.

region_map
I really like this new type of visualization: it is easy to use and allows you to add nice map visualizations (even with self-hosted layers configured in the config/kibana.yml file) to your Kibana dashboards.

If you use Kibana to visualize logs and you also use Logstash, take a look at this plugin: GeoIP Filter. The GeoIP filter adds information about the geographical location of IP addresses, based on data from the MaxMind GeoLite2 databases (so you can use the geographical location in your region map).

5 thoughts on “Real-time Tweets geolocation visualization with Elasticsearch and Kibana region map”

  1. Hello! Thank you for the interesting article. But I have a question about the line

    r = HTTP.request('GET', "https://maps.googleapis.com/maps/api/geocode/json?latlng=" + str(dict_data['coordinates']['coordinates'][1]) + "," + str(dict_data['coordinates']['coordinates'][0])) # call Google Maps Geodecode API

    According to https://developers.google.com/maps/documentation/geocoding/start#reverse an API_KEY must be defined for this to work correctly.
    Could you provide a modified version of this line that includes the mandatory API key?

    Thanks!

    1. Hi Igor, thank you for your feedback! At the time of writing (August 2017) that API did not require a key (there was an IP-based quota to determine when you reach the 2,500-queries daily limit). Today an API Key is needed, so I updated the code, thank you for pointing that out.

      1. Matteo,
        Thank you! But unfortunately, I still receive only “None” for country 🙁
        What if we try to use ‘location’ from the tweet stream instead of ‘coordinates’? People often specify the location in their Twitter profiles. This location could then be parsed and passed to https://maps.googleapis.com/maps/api/geocode/json?address=….. to extract the country code for Kibana.
        Sorry, but I’m absolutely not familiar with python. What do you think about this variant?

        1. AFAIK the location is not always the best choice: you can type whatever you want in the location field when you tweet your content.
          If you send me (zuccon.matteo@gmail.com) the details of the location (lat, long) you are trying to geocode, I will take a look at it (to see why you get None). You can also reach me on Twitter @matteo_zuccon

  2. Hi,
    I’m doing something similar and ran into a problem with the default linear color selection. My application (in one run using variable query parameters) finds tweets in 5 countries, with one country having a huge number (>500k) and 4 others (<10k). Kibana displays this map using two colors: one for the big number and one for the group of smaller numbers. How can I get a unique color for each country?
