Python Fire: a library for creating command line interfaces (CLIs) from absolutely any Python object

A few days ago Google open-sourced Fire, a Python library that generates command line interfaces (CLIs) from any Python code. Simply call the Fire function in any Python program to automatically turn that program into a CLI.
You can install Fire using pip:
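```
pip install fire
```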

Fire is easy to use; take a look at this example:
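A minimal example, along the lines of the one in the Fire README:

```python
import fire

class Calculator(object):
  """A simple calculator class."""

  def double(self, number):
    return 2 * number

if __name__ == '__main__':
  fire.Fire(Calculator)
```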

From the command line we can now run:
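```
python calculator.py double 10  # 20
python calculator.py double --number=15  # 30
```

Fire exposes the double method and parses its argument for us, either positionally or by name.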

When you create the Fire object you can specify the name of the command as entered at the command line.

Example usage:
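A sketch, reusing the Calculator class from above (the name argument is what appears in help and usage output):

```python
import fire

if __name__ == '__main__':
  # 'calculator' will be shown as the command name in help/usage messages
  fire.Fire(Calculator, name='calculator')
```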

I am already using Fire for a small project: it is very useful, it works well, and it saves you time (you do not need to write code to parse input arguments).

Here you can find some useful resources and examples:

Post Docker containers statistics to Slack

In this post we are going to see how to monitor Docker containers' resource usage statistics and send alarm notifications to a Slack channel.
The Docker Engine lets you see these statistics by running the docker stats command, which returns a live data stream for running containers.

Here you can find the official Docker documentation of the command: Docker stats
If you are wondering what Slack is, let me just say that it is an instant messaging and collaboration system based on channels.
You can read more here: Slack

We are going to monitor the containers' resources using a Python script. There are a lot of container management systems, but I found some of them too complicated or not very useful. If you want something light, easy, and open-source, I suggest Portainer.io (I am going to write a post about it).

We are going to use the docker-py Python client library to connect to the Docker Remote API.
Here you can find the library Github repository: docker-py.
If you do not want to use Python, here you can find a list of client libraries for other programming languages: Docker Remote API client libraries

Install the library with pip:
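```
pip install docker
```

(Depending on the version, the package is published on PyPI as docker, or as docker-py for the older 1.x releases.)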

Connect to the Docker daemon and to the Docker Remote API (specify the Docker server address):
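A minimal sketch, assuming the Docker daemon is listening on the default local Unix socket:

```python
import docker

# Change base_url to e.g. 'tcp://192.168.1.10:2375' for a remote Docker server
client = docker.DockerClient(base_url='unix://var/run/docker.sock')
```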

For each running container we can now stream the resource statistics.
To list all the running containers use the client.containers.list() method.
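```python
for container in client.containers.list():
    print(container.name, container.id)
```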

To stream the statistics for a given container use the client.stats method. It takes the container name as an argument and returns a generator (Wiki Python Generator).
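A sketch using the low-level client ('my-container' is a placeholder name):

```python
# decode=True makes the generator yield parsed dicts instead of raw JSON lines
api_client = docker.APIClient(base_url='unix://var/run/docker.sock')

for stats in api_client.stats(container='my-container', decode=True):
    print(stats['cpu_stats'], stats['memory_stats'])
```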

Here you can find the official documentation of Low-level API: docker-py low-level API.
The stats method returns a JSON object with the following format (note the CPU and memory usage information):
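An abbreviated sketch of the payload (most fields omitted, values are illustrative):

```json
{
  "read": "2017-01-01T00:00:00.000000000Z",
  "cpu_stats": {
    "cpu_usage": { "total_usage": 100937101, "percpu_usage": [100937101] },
    "system_cpu_usage": 37109790000000
  },
  "precpu_stats": { "note": "previous CPU sample, used to compute percentages" },
  "memory_stats": { "usage": 6537216, "limit": 1044574208 }
}
```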

We are now going to analyze the resource usage statistics and, if the usage of some resource exceeds our thresholds, post an alarm message to Slack.
I am not going to post here how to extract the resource usage from the JSON object, but you can find the full code in this Github repository: mz1991/docker-stats-slack.

To send the notification to Slack we will use the Slack Webhook integration. Webhooks are a simple way to post messages from external sources into Slack. They make use of normal HTTP requests with a JSON payload that includes the message text and some options.
You can read more here: Slack Incoming Webhooks.
I assume you configured the Slack Incoming Webhook integration for your Slack team and you have the Webhook URL.
To configure the Incoming Webhook integration for your Slack team, you can use the following URL: https://[your_slack_team].slack.com/apps/A0F7XDUAZ-incoming-webhooks

With the Webhook integration we do not need any Slack library: to post a message to the Slack channel we just send an HTTP POST request to the Webhook URL endpoint.

The format of the JSON we are going to post is the following:
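For example (the channel, username, and emoji values are placeholders):

```json
{
  "channel": "#alarms",
  "username": "docker-stats-bot",
  "icon_emoji": ":whale:",
  "text": "Container web-1: memory usage 92% exceeds the 80% threshold"
}
```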

You need to specify the channel where you want to post the message, the display name and emoji of the posting user, and the text of the message.

You can find the Webhook documentation here: https://api.slack.com/incoming-webhooks
and the list of available emoji here: Slack Emoji Cheat Sheet

To post the message we are going to use the Request class from Python's built-in urllib.request module.
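A minimal sketch of the POST:

```python
import json
from urllib.request import Request, urlopen

def post_to_slack(webhook_url, payload):
    """Send a payload dict (like the JSON shown above) to the Webhook URL."""
    request = Request(
        webhook_url,
        data=json.dumps(payload).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
    )
    return urlopen(request)
```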

The posted message will look like this:

[screenshot: the alarm message posted to the Slack channel]
We saw how to stream the container statistics and how to post an alarm message to a Slack channel.
I built a Python script that uses a set of environment variables for the Slack channel configuration and the resource usage thresholds.

These are the environment variables needed:

  • SLACK_WEBHOOK_URL: the webhook url for your Slack team
  • SLACK_CHANNEL: the channel id (where the message will be posted)
  • SLACK_USERNAME: the username for the incoming messages
  • SLACK_EMOJI: the emoji for the incoming messages
  • MEMORY_PERCENTAGE: maximum percentage of memory usage for each container. When the percentage of used memory exceeds this threshold, an alarm will be posted to the Slack channel
  • CPU_PERCENTAGE: maximum percentage of CPU usage for each container. When the CPU usage percentage exceeds this threshold, an alarm will be posted to the Slack channel
  • SLEEP_TIME: interval, in seconds, between consecutive messages posted to Slack, tracked separately for each container

To run the script use the following commands:
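For example (the values and the script name are placeholders; check the repository for the actual entry point):

```bash
export SLACK_WEBHOOK_URL=https://hooks.slack.com/services/...
export SLACK_CHANNEL="#alarms"
export SLACK_USERNAME=docker-stats-bot
export SLACK_EMOJI=:whale:
export MEMORY_PERCENTAGE=80
export CPU_PERCENTAGE=80
export SLEEP_TIME=300

python docker-stats-slack.py
```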

All the code from this post can be found in this Github repository: mz1991/docker-stats-slack.
Feel free to fork it and to add further features.

Twitter sentiment analysis with Amazon Rekognition

During the AWS re:Invent event, a new service for image analysis has been presented: Amazon Rekognition. Amazon Rekognition is a service that makes it easy to add image analysis to your applications. With Rekognition, you can detect objects, scenes, and faces in images. You can also search and compare faces. Rekognition’s API enables you to quickly add sophisticated deep learning-based visual search and image classification to your applications.
You can read more here: Amazon Rekognition and here: AWS Blog.

In this post we are going to see how to use the AWS Rekognition service to analyze some Tweets (in particular the media attached to the Tweet with the Twitter Photo Upload feature) and extract sentiment information. This task is also called emotion detection and sentiment analysis of images.
The idea is to track the sentiments (emotions, feelings) of people who are posting Tweets (with media) about a given topic (defined by a set of keywords or hashtags).

Given a set of filters (keywords or hashtags) we will use the Twitter Streaming API to get some real-time Tweets and we will analyze the media (images) within the Tweet.
AWS Rekognition gives us information about the faces in the picture, such as the emotions identified, the gender of the subjects, the presence of a smile, and so on.

For this example we are going to use Python 3.4 and the official AWS Python SDK, Boto3.

First, define a new client connection to the AWS Rekognition service.
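A minimal sketch, assuming your AWS credentials are already configured (for example via environment variables or ~/.aws/credentials):

```python
import boto3

# Rekognition client; pick a region where the service is available
rekognition = boto3.client('rekognition', region_name='us-east-1')
```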

I assume you know how to work with the Twitter Streaming API; an example is reported below, and you can find the code of this project in this Github repository: tweet-sentiment-aws-reko.
Example of Tweet download using the Streaming API and Python Tweepy:
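A sketch using Tweepy's StreamListener (the credential constants and the process_image helper are placeholders):

```python
import tweepy

class MediaListener(tweepy.StreamListener):
    def on_status(self, status):
        # Keep only Tweets that carry media attachments
        for media in status.entities.get('media', []):
            process_image(media['media_url'])  # hypothetical helper

    def on_error(self, status_code):
        # Returning False disconnects the stream (e.g. on rate limiting)
        return False

auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)

stream = tweepy.Stream(auth=auth, listener=MediaListener())
stream.filter(track=['#mykeyword'])  # the filters define the topic to track
```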

For each Tweet downloaded from the stream (we discard Tweets without media), we analyze the image and extract the sentiment information.
To detect faces we use the detect_faces method. It takes the image blob (or a reference to an image stored in S3) and returns details about the faces it finds.
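A sketch of the call, using the bytes of a locally downloaded image:

```python
with open(image_path, 'rb') as image_file:
    image_bytes = image_file.read()

# Attributes=['ALL'] asks Rekognition to return emotions, gender, smile, etc.
response = rekognition.detect_faces(
    Image={'Bytes': image_bytes},
    Attributes=['ALL'],
)
```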

To check whether at least one face has been found within the image, check that the 'FaceDetails' list in the response dictionary is not empty.
We can now loop through the identified details and store the results.
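A sketch of the aggregation (the counter names are my own):

```python
from collections import Counter

smile_count = 0
gender_count = Counter()
emotion_occurrences = Counter()
emotion_confidence = Counter()  # sum of confidences per emotion type

for face in response.get('FaceDetails', []):
    if face['Smile']['Value']:
        smile_count += 1
    gender_count[face['Gender']['Value']] += 1
    for emotion in face['Emotions']:
        emotion_occurrences[emotion['Type']] += 1
        emotion_confidence[emotion['Type']] += emotion['Confidence']
```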

We can now print the results to understand the sentiments within the analyzed images, in particular:

  • How many smiling faces were there in the pictures?
  • How many men/women?
  • Which emotions were detected, and with which confidence (on average)?

This is an example of output (analysis performed on a set of 20 images).
[screenshot: aggregated analysis results]

We can use this simple system to track and analyze the sentiment within the photos posted on Twitter. Given a set of posted pictures, we can understand whether the people in the pictures are happy, whether they are male or female, and which kinds of emotions they are feeling (this can be very useful to understand the reputation of a brand/product).

Note that AWS Rekognition also gives information about the presence of a beard, mustache, or sunglasses, and can compute the similarity between two faces (this can be useful to understand whether multiple pictures represent the same people).

I would like to improve the system: it would be interesting to also analyze the text of each Tweet to see whether there is a correlation (and how strong it is) between the sentiment of the text and the sentiment of the image. If you want to contribute, do not hesitate to contact me (it would also be cool to build a front-end to browse the downloaded media and the results in a nicer way).

Boto3 AWS Rekognition official documentation: Rekognition
Github repository with the full code: tweet-sentiment-aws-reko.

City Bikes Telegram Bot

Release day! Today we are launching a new Telegram Bot, Leonardo City Bikes Bot.
“Bots are simply Telegram accounts operated by software – not people – and they’ll often have AI features. They can do anything – teach, play, search, broadcast, remind, connect, integrate with other services, or even pass commands to the Internet of Things.” Read about the Telegram Bots here: Telegram Bot platform.

So, what can Leonardo do for you? He can help you find the closest bike stations around you. Just share your GPS location with him, or simply type an address or a place of interest, and he will give you helpful information about the closest bike stations.


You can choose the distance within which the stations need to be. Leonardo gives you the name of the stations, their location (so you can easily navigate there), and useful information about free bikes and empty slots.


To start chatting with the bot, reach him here: https://telegram.me/citybikesbot or search for @CityBikesbot in the Telegram search bar.

Leonardo can speak the following languages:

  • Italian
  • English
  • Norwegian
  • Polish
  • German
  • Russian
  • Spanish
  • French

Leonardo integrates with a lot of bike-rental networks around the world, mainly in Europe. Take a look at this website to search for your location: CityBikes.

The Bot has been built with Python (using the twx.botapi library) and is running in the cloud on the Amazon AWS platform.
MongoDB and Redis are used to store some information about the bike stations and to cache some content to make the user experience smoother. The application has been built and shipped using Docker.

The brain of Leonardo is powered by the CityBikes API.
Read more about the project: City Bikes API

Feel free to use the bot and share your feedback with us.
If you want to contribute to the project to extend the features, add further integration or new translations, do not hesitate to contact me.

The version for Facebook Messenger is under construction.

Machine learning with Tensorflow and Elasticsearch

In this post we are going to see how to build a machine learning system that performs an image recognition task. Image recognition is the process of identifying and detecting an object or a feature in a digital image or video. The tools that we will use are the following:

  • Amazon S3 bucket
  • Amazon Simple Queue Service
  • Google TensorFlow machine learning library
  • Elasticsearch

The idea is to build a system that will process the image recognition task against some images stored in a S3 bucket and will index the results to Elasticsearch.
The library used for the image recognition task is TensorFlow.
TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API. TensorFlow was originally developed by researchers and engineers working on the Google Brain Team within Google’s Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well. You can read more about it here.

These are the main steps performed in the process:

  • Upload image to S3 bucket
  • Event notification from S3 to an SQS queue
  • Event consumed by a consumer
  • Image recognition on the image by TensorFlow
  • The result of the classification is indexed in Elasticsearch
  • Search in Elasticsearch by tags

This image shows the main steps of the process:

[diagram: image upload → S3 bucket → SQS event notification → consumer → TensorFlow classification → Elasticsearch index]

Event notifications

When an image is uploaded to the S3 bucket, a message will be stored to an Amazon SQS queue. To configure the S3 bucket and to read the queue programmatically, you can read my previous post:
Amazon S3 event notifications to SQS

Consume messages from Amazon SQS queue

Now that the S3 bucket is configured, when an image is uploaded to the bucket an event notification will be stored to the SQS queue. We are going to build a consumer that reads this notification, downloads the image from the S3 bucket, and performs the image classification using Tensorflow.

With the following code you can read the messages from the SQS queue, download the image from the S3 bucket, and store it locally (ready for the image classification task):
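A sketch of the consumer (the queue name is a placeholder; the message body follows the standard S3 event notification format):

```python
import json
import boto3

sqs = boto3.resource('sqs', region_name='us-east-1')
s3 = boto3.client('s3', region_name='us-east-1')

queue = sqs.get_queue_by_name(QueueName='image-events')  # hypothetical name

while True:
    for message in queue.receive_messages(WaitTimeSeconds=10):
        body = json.loads(message.body)
        for record in body.get('Records', []):
            bucket = record['s3']['bucket']['name']
            key = record['s3']['object']['key']
            local_path = '/tmp/' + key.split('/')[-1]
            # Download the uploaded image, ready for classification
            s3.download_file(bucket, key, local_path)
        message.delete()  # remove the processed notification from the queue
```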

Image recognition task

Now that the image (originally uploaded to S3) has been downloaded we can use Tensorflow to run the image recognition task.
The model used by Tensorflow for the image recognition task is the Inception-V3. It achieved a 3.46% error rate in the ImageNet competition. You can read more about it here: Inception-V3 and here: Tensorflow image recognition.

I used the Tensorflow Python API; you can install it using pip:
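```
pip install tensorflow
```

(For the 0.11 release used in this post, the Download and Setup page linked below lists version-specific wheel URLs; recent releases install directly from PyPI as shown above.)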

You can find all the information about setup and installation here: Download and Setup Tensorflow. Here you can find an official code lab by Google: Tensorflow for poets.

So, starting from the classify_image.py code (you can find it on Github: classify_image.py), I created a Python module that, given the local path of an image (the one previously downloaded from S3), returns a dictionary with the result of the classification.
The result of the classification consists of a set of tags (the objects recognized in the image) and scores (the score represents the probability of a correct classification. The scores sum to one).

So, calling the function run_image_recognition with the image path as argument will return a dictionary with the result of the classification.
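A sketch of the module, assuming the create_graph and NodeLookup helpers from classify_image.py are available (their definitions are omitted here):

```python
import numpy as np
import tensorflow as tf

def run_image_recognition(image_path, num_predictions=5):
    """Classify an image and return a {tag: score} dictionary."""
    image_data = tf.gfile.FastGFile(image_path, 'rb').read()

    # create_graph() (from classify_image.py) loads the Inception-V3 model
    create_graph()

    with tf.Session() as sess:
        softmax_tensor = sess.graph.get_tensor_by_name('softmax:0')
        predictions = sess.run(softmax_tensor,
                               {'DecodeJpeg/contents:0': image_data})
        predictions = np.squeeze(predictions)

        # NodeLookup (from classify_image.py) maps node ids to readable tags
        node_lookup = NodeLookup()
        top_k = predictions.argsort()[-num_predictions:][::-1]
        return {node_lookup.id_to_string(node_id): float(predictions[node_id])
                for node_id in top_k}
```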

In the previously shown code, the definitions of the helper functions from classify_image.py are not reported (you can find them in the Github repository I linked).
The first time you run the image classification task, the model (Inception-V3) will be downloaded and stored to your file system (it is around 300MB).

Index to Elasticsearch

So, given an image, we now have a set of tags that classify it. We want to index these tags to Elasticsearch. To do that I created a new index called imagerepository and a new type called image.

The image type we are going to create will have the following properties:

  • title: the title of the image
  • s3_location: the link to the S3 resource
  • tags: field that will contain the result of the classification task

For the tags property I used the Nested datatype. It allows arrays of objects to be indexed and queried independently of each other.
You can read more about it here:
Nested datatype
Nested query

We will not store the image itself in Elasticsearch, just the URL of the image within the S3 bucket.

New Index:
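```
PUT imagerepository
```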

New Type:
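A sketch of the mapping (the tag and score field names inside tags are my choice):

```
PUT imagerepository/_mapping/image
{
  "properties": {
    "title":       { "type": "text" },
    "s3_location": { "type": "keyword" },
    "tags": {
      "type": "nested",
      "properties": {
        "tag":   { "type": "keyword" },
        "score": { "type": "float" }
      }
    }
  }
}
```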

You can now try to post a test document:
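For example (the values are placeholders):

```
POST imagerepository/image
{
  "title": "waterfall.jpg",
  "s3_location": "https://s3.amazonaws.com/my-bucket/waterfall.jpg",
  "tags": [
    { "tag": "waterfall", "score": 0.97 },
    { "tag": "cliff",     "score": 0.02 }
  ]
}
```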

We can index a new document using the Elasticsearch Python SDK.
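A sketch with the elasticsearch-py client, reusing the dictionary returned by run_image_recognition and the bucket/key from the SQS consumer:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(['http://localhost:9200'])

classification = run_image_recognition(local_path)
document = {
    'title': key.split('/')[-1],
    's3_location': 'https://s3.amazonaws.com/{}/{}'.format(bucket, key),
    'tags': [{'tag': tag, 'score': score}
             for tag, score in classification.items()],
}
es.index(index='imagerepository', doc_type='image', body=document)
```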

Search

Now that we have indexed our documents in Elasticsearch, we can search for them.
These are examples of queries we can run:

  • Give me all the images that represent this object (searching by tag = object_name)
  • What does this image (given the title) represent?
  • Give me all the images that represent this object with at least 90% probability (search by tag = object_name and score >= 0.9)

I wrote some Sense queries.

Images that represent a waterfall:
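Assuming the mapping sketched above:

```
GET imagerepository/image/_search
{
  "query": {
    "nested": {
      "path": "tags",
      "query": {
        "term": { "tags.tag": "waterfall" }
      }
    }
  }
}
```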

Images that represent a pizza with at least 90% probability:
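```
GET imagerepository/image/_search
{
  "query": {
    "nested": {
      "path": "tags",
      "query": {
        "bool": {
          "must": [
            { "term":  { "tags.tag": "pizza" } },
            { "range": { "tags.score": { "gte": 0.9 } } }
          ]
        }
      }
    }
  }
}
```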

In this post we have seen how to combine the powerful machine learning library Tensorflow, used to perform an image recognition task, with the search power of Elasticsearch, used to index the image classification results. The pipeline also includes an S3 bucket (where the images are stored) and an SQS queue used to receive event notifications when a new image is stored to S3 (and is ready for the image classification task).

I ran this demo using the following environment configuration:

  • Elasticsearch 5.0.0
  • Python 3.4
  • tensorflow-0.11.0rc2
  • Ubuntu 14.04