Kibana Tag Cloud

With version 5.1.1, Kibana added a new type of visualization: the Tag Cloud chart.
A tag cloud visualization is a visual representation of text data, typically used to visualize free-form text. Tags are usually single words, and the importance of each tag is shown through font size or color.
In this post we are going to see how to use this new visualization type. I assume you have already installed and configured Elasticsearch and Kibana.

First of all, create a new index and index some documents. I indexed a JSON file containing the entire works of Shakespeare.

Each document has the following format.
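
A sketch of the structure, with field names from the Shakespeare dataset distributed by Elastic (the values shown are illustrative):

    {
        "line_id": 4,
        "play_name": "Henry IV",
        "speech_number": 1,
        "line_number": "1.1.1",
        "speaker": "KING HENRY IV",
        "text_entry": "So shaken as we are, so wan with care,"
    }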

You can download it here (notice it is around 24 MB): shakespeare.json.

Create a new index.
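
For example (a minimal sketch: the mapping keeps the fields we will aggregate on, such as speaker, as non-analyzed keyword fields, which is what the tag cloud's terms aggregation needs):

    curl -XPUT 'http://localhost:9200/shakespeare' -d '
    {
      "mappings": {
        "_default_": {
          "properties": {
            "speaker": { "type": "keyword" },
            "play_name": { "type": "keyword" },
            "line_id": { "type": "integer" },
            "speech_number": { "type": "integer" }
          }
        }
      }
    }'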

And index the documents using the Bulk Index API.
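
Assuming shakespeare.json is already in the bulk format (an action line followed by a document line), something like:

    curl -XPOST 'http://localhost:9200/shakespeare/_bulk?pretty' --data-binary @shakespeare.json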

Now, from the Kibana Visualize page, select the tag cloud visualization chart.
[Image: kibana_v1]

[Image: kibana_v2]

You only need to specify the field used to build the tag cloud. Notice that the tag cloud only supports the terms aggregation.
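
Under the hood this corresponds to a plain terms aggregation; as a rough sketch, the equivalent query for the speaker field used in this example would be:

    curl -XGET 'http://localhost:9200/shakespeare/_search?size=0' -d '
    {
      "aggs": {
        "speakers": {
          "terms": { "field": "speaker", "size": 25 }
        }
      }
    }'
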
[Image: kibana_v3]

In this example I selected the speaker field, so the tag cloud will depict the main (highest count) speakers within the works of Shakespeare.
You can also set a number of other options, such as the tag font size and orientation.
[Image: kibana_v4]

The main speakers within the works of Shakespeare are Gloucester and Hamlet.
[Image: kibana_v6]

You can save this visualization and add it to your dashboard.

The tag cloud visualization is a useful visual representation of text data that can be used to depict keyword metadata (tags) of the documents in an Elasticsearch index.

Elasticsearch and Kibana with Docker

Last weekend, on the occasion of the Docker Global Mentor Week, I attended the Docker meetup in Milan. I improved my knowledge of the container world, so I decided to use Docker and Docker Compose to ship Elasticsearch and Kibana. I already wrote some posts about Docker; you can find them here: Docker and Docker Compose and Docker Compose and Django.

I assume you already have basic knowledge of the main Docker commands (run, pull, etc.).

I have been using Docker version 1.12.3 and Docker Compose 1.8.1 (be sure your docker-compose version supports version 2 of the docker-compose file format).
We can directly pull the images for Elasticsearch and Kibana (I am using the latest version, 5.0.1):
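
    # official images on Docker Hub
    docker pull elasticsearch:5.0.1
    docker pull kibana:5.0.1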

The Elasticsearch image is based on the openjdk:8-jre image; you can find the Dockerfile here: Elasticsearch 5.0.1 Dockerfile.
The Kibana image is based on the debian:jessie image; you can find the Dockerfile here: Kibana 5.0.1 Dockerfile.

I defined a docker-compose.yml file to ship the two containers with the previously pulled images, exposing the default ports: 9200 for Elasticsearch and 5601 for Kibana. The environment variable defined within the Kibana service holds the Elasticsearch URL (within Docker you just need to specify the service name; it will automatically be resolved to an IP address).
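
A minimal sketch of such a docker-compose.yml (ELASTICSEARCH_URL is the variable read by the official Kibana image):

    version: '2'
    services:
      elasticsearch:
        image: elasticsearch:5.0.1
        ports:
          - "9200:9200"
      kibana:
        image: kibana:5.0.1
        ports:
          - "5601:5601"
        environment:
          # "elasticsearch" is the service name, resolved through the Docker network
          - ELASTICSEARCH_URL=http://elasticsearch:9200
        depends_on:
          - elasticsearch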

With version 2 of the docker-compose file you do not have to specify links between the services: they are automatically placed on the same network (unless you specify a custom one).

The latest version of Elasticsearch is stricter about bootstrap checks, so be sure to correctly set vm.max_map_count and the number of file descriptors (Wiki: file descriptor).
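
On the Docker host, for example:

    # raise the mmap count limit required by the Elasticsearch bootstrap checks
    sudo sysctl -w vm.max_map_count=262144

    # make the setting persistent across reboots
    echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf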

You can read more about these bootstrap checks here: Bootstrap Checks

We can now ship the two containers using the docker-compose up command.
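
From the directory containing the docker-compose.yml:

    docker-compose up -d    # -d starts the containers in the background
    docker-compose ps       # verify that both services are up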

The two containers have been shipped and are running; we can reach Kibana at http://localhost:5601 and Elasticsearch at http://localhost:9200.

[Image: es_kibana_containers]

So with Docker and docker-compose we can easily run Elasticsearch and Kibana, focusing on application development instead of environment setup.

Reporting meets Kibana

Today the Elastic team officially released Reporting. Reporting is a new product that allows you to easily generate PDFs of your Kibana searches, visualizations, and dashboards. It’s great for getting a snapshot of your data and sharing it with anyone.

The Reporting plugin adds a new interface right inside Kibana that allows you to create a report based on what you have open.
You can also use it with Watcher to trigger reports in response to an event, or simply have reports emailed on a set schedule. Reports can be automated using any tool that can make an HTTP request.
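
For example, assuming you have copied the report generation URL from the Reporting interface in Kibana (the path below is only a placeholder), a report could be triggered with a plain curl call:

    # replace the placeholder path with the generation URL copied from Kibana
    curl -XPOST -u user:password 'http://localhost:5601/<generation-url-copied-from-kibana>'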

[Image: Kibana_Heart]

To install Reporting:
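
A sketch of the installation, assuming the Kibana 4.x plugin installer and the License plugin on the Elasticsearch side (check the official release post for the exact commands for your versions):

    # on the Elasticsearch host
    bin/plugin install license

    # on the Kibana host
    bin/kibana plugin --install kibana/reporting/latest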

Here you can find the official release post on the Elastic blog.

If you are looking for a third-party solution to distribute and schedule your reports, take a look at Skedler Reporting.

Kibana Tile Map

Kibana is the tool developed by the Elastic team to explore and visualize the data stored in Elasticsearch. It provides several different chart types, and in this post we are going to see how to use the tile map chart.
A tile map displays a geographic area overlaid with circles keyed to the data determined by the buckets you specify.

In this post we will use the data coming from the web site bandaultralarga, a web site managed by the Italian government that shows the strategy for the rollout of broadband connectivity (30 Mbps or 100 Mbps optical fiber) across the Italian territory over the coming years, as well as the current situation.

I am going to use the data of the region Lombardia. You can download the data here (OpenData IODL 2.0 license).

We will build a heat map showing the current distribution of the 100 Mbps band across the region.

For each data instance we will consider the following fields:

  • NomeProvincia: the name of the province of a given city
  • NomeRegione: the name of the region of a given city (in this example all of the cities are in the Lombardia region)
  • NomeComune: the name of the city
  • oggi_100: the current percentage of the city covered by the 100 Mbps band
  • geo: the latitude and longitude of the given city

This is an example of a data instance that we are going to store in Elasticsearch:
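
Something like the following (the values shown are illustrative):

    {
      "NomeProvincia": "Bergamo",
      "NomeRegione": "Lombardia",
      "NomeComune": "Bergamo",
      "oggi_100": 25.5,
      "geo": "45.6983,9.6773"
    }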

I created a new Elasticsearch index called networkcoverage with a new mapping type called city:
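
A minimal sketch of the index creation with that mapping (geo is the important field here; the exact types of the other fields may differ):

    curl -XPUT 'http://localhost:9200/networkcoverage' -d '
    {
      "mappings": {
        "city": {
          "properties": {
            "NomeProvincia": { "type": "keyword" },
            "NomeRegione":   { "type": "keyword" },
            "NomeComune":    { "type": "keyword" },
            "oggi_100":      { "type": "float" },
            "geo":           { "type": "geo_point" }
          }
        }
      }
    }'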

As you can see, the type of the geo field is geo_point: the data type used to store geographic points.
For the geo_point data type, the following formats are supported:

Lat Lon as Properties

Lat Lon as String

Format in lat,lon.

Geohash

Lat Lon as Array

Format in [lon, lat]; note the order of lon/lat here, in order to conform with GeoJSON.
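
For reference, here are the four formats applied to the geo field of this post, one document per line, in the order listed above (values adapted from the Elasticsearch documentation):

    { "geo": { "lat": 41.12, "lon": -71.34 } }
    { "geo": "41.12,-71.34" }
    { "geo": "drm3btev3e86" }
    { "geo": [ -71.34, 41.12 ] }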

I represented the latitude and longitude as a string, e.g. “41.12,-71.34”, and loaded the data instances using the Bulk API.

Now that the data is stored in Elasticsearch, we can open Kibana and, under the Visualize section, create a new tile map.

[Image: tile map]

To add a geo coordinates bucket, just select the field that contains the coordinate information (in this example, the geo field).
[Image: geoField]

We are interested in the current 100 Mbps coverage, so as the metric we are going to select the max of the oggi_100 field.

[Image: metric]
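
Roughly speaking, the aggregation Kibana builds for this chart is a geohash_grid bucket with a max sub-aggregation; a sketch of the equivalent query (the precision value is an assumption, as Kibana derives it from the zoom level):

    curl -XGET 'http://localhost:9200/networkcoverage/_search?size=0' -d '
    {
      "aggs": {
        "grid": {
          "geohash_grid": { "field": "geo", "precision": 5 },
          "aggs": {
            "coverage": { "max": { "field": "oggi_100" } }
          }
        }
      }
    }'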

With this visualization we can clearly see where the band coverage is high, but we cannot see the points where the percentage is really low (because the dots on the map are really small).
To fix this we need to change some chart options.

In the Options tab of the chart, change the map type to Shaded geohash grid to visualize a grid instead of circles and solve the issue described above.

[Image: grid]

We built a heat map that shows the 100 Mbps coverage across the Lombardia region: the darker colors represent the areas where the percentage of territory covered by the 100 Mbps band is higher.

I played a bit with Kibana and created a dashboard where it is possible to filter the map by province or city.

This is the final result:

[Image: dashbard]

We can filter the map by province (in the example I selected the Bergamo province):

[Image: dashbard1]

or by city (in the example I selected the city of Milano):

[Image: dashbard3]