How to integrate user feedback into an improved version of Timetag

The world is changing at breakneck speed, so as a software-solutions company it’s important to keep pace. This means that aside from creating new projects, we also keep our focus on existing products and make sure they stay up to date and run as smoothly as possible. Recently, however, we decided to take one of our more successful products - Timetag - even further.

Timetag came about as an internal project, a much-needed answer to a growing company’s need for a time tracking solution. Since XAOP had some bad (usability) experiences with corporate time tracking software, we were looking for a low-overhead solution that wouldn’t interfere with our agile workflow.

We came up with the idea of registering our time in Google Calendar, the main advantages being:

  • most people already use this calendar every day, so there’s no need to learn yet another new tool,
  • lots of different calendar tools are available - you can use whichever you want, BYOA “avant la lettre”,
  • a calendar UI inherently gives you a good overview of the “completeness” of your entered time,
  • it not only lets you log your time, but also plan events, such as meetings, holidays, etc., and
  • you can share your calendar with your colleagues.

Originally intended for internal use only - to manage project budgets and help the invoicing process - Timetag was quickly made publicly available for free due to its ease of use and success amongst our colleagues. Over the next few years, we continued to add small improvements to the app and build a small but loyal customer base.

The next phase started in September 2014. Because of the growing number of customers and the Timetag architecture, we were starting to hit the Google Calendar API limits. The only way to solve this was a rather drastic one: after thoroughly considering the risks, we started a full rewrite of Timetag. A couple of months and about 320 workdays of effort later, the new version was ready for prime time. Obviously we tackled the blocking backend issues, but we also implemented a completely new design, added some new features, improved the user experience, and made the reporting much more powerful.

Activity graph of the Timetag rewrite project (source: XAOP Timetag reports)

Finally, we also decided to add two paid plans with additional features and integrated the Stripe payment platform. Timetag remains free for personal use, but we ask a small fee per user when it is used as a team. Reductions on the standard price and free trials are available, so don’t hesitate to check these out. This past year we’ve also built an API for Timetag, which allows developers to reuse the raw data and automate updates, and an iOS app, an easy-to-use way to add entries to Timetag (without needing to know your tags) that are then stored in your calendar.

Still, there is always room for improvement. What’s most important to us is how our users feel about using the application:

  • Do they find it easy to use?
  • Do they like the design?
  • Is there something that we could add to the application to make it easier to use?
  • Are they still using Timetag regularly?
  • What could motivate the people who stopped using Timetag/use it infrequently to use it more often?

To find the answers to these and other questions, we launched a Timetag Feedback Campaign, in which users were asked to fill out a short survey about their experience with the application. The response we received was overwhelming - full of useful suggestions, constructive criticism and grateful praise. We’re very pleased and appreciate the time everyone took to fill out the survey and help us out.

Now it’s our turn to process, analyze and integrate these suggestions into the new release of Timetag. We’re excited to get started. Stay tuned for our next update!


Real-time data processing - Apache Flink and Amazon EMR

In our previous blog post, we talked about serverless architectures and different interpretations of the term. On one end of the spectrum, we had Amazon RDS (Relational Database Service) and on the other AWS Lambda. This post puts the spotlight on Amazon EMR (Elastic MapReduce) and how we took advantage of its simplicity to both accelerate and lower the cost of development. Amazon EMR is serverless in the same sense as Amazon RDS: software installation and system maintenance are of no concern, while the concept of a server machine remains a basic building block. We will also circle back to an earlier post on scalability and how horizontal scaling benefits both cost and performance.

Recently we were working on a project that required flexible, horizontally scalable data processing. While we were considering the usual frameworks like Apache Spark, newcomer Flink caught our eye. After due deliberation, we decided to go with Flink, as its feature set seemed not only to meet our requirements but also to mesh well with our usual technology stack.

Flink is a streaming dataflow engine developed by the Apache Foundation. At a glance, it’s very similar to the better-known Apache Spark, and their APIs are quite alike. The fundamental difference between the two is that while Spark is a batch-processing engine with support for real-time streams through the processing of small batches of data, Flink was built from the ground up as a native streaming dataflow engine, where data is pushed through the pipeline as soon as it arrives. Batch processing is also supported and built on top of the streaming framework. This makes Flink’s architecture a sort of middle ground between Spark’s ease of use and Storm’s high-performance stream processing capabilities.
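
To give an idea of what this looks like in practice, here is a minimal sketch of a Flink streaming job in Java (roughly the Flink 1.0 DataStream API): it counts words as lines arrive on a socket. The host, port and the word-count task itself are purely illustrative and not part of our own pipeline.

```java
// Minimal Flink DataStream job: counts words arriving on a socket stream.
// Host, port and the word-count task are illustrative placeholders.
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

public class StreamingWordCount {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Each incoming line is split into words and counted as it arrives,
        // rather than being collected into micro-batches first.
        DataStream<Tuple2<String, Integer>> counts = env
            .socketTextStream("localhost", 9999)
            .flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
                @Override
                public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
                    for (String word : line.toLowerCase().split("\\W+")) {
                        if (!word.isEmpty()) {
                            out.collect(new Tuple2<>(word, 1));
                        }
                    }
                }
            })
            .keyBy(0)
            .sum(1);

        counts.print();
        env.execute("Streaming WordCount");
    }
}
```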

At XAOP, we have used Flink to develop a highly distributed pipeline for the calculation of pairwise protein sequence alignments. We chose Flink because it is an interesting new technology that has recently been gaining traction; its first fully stable version, 1.0, was released earlier this year (March 2016). It provided a fitting solution to our problem and integrates seamlessly with most Amazon Web Services. Since Flink can run on a Hadoop YARN cluster, it’s also possible to run it on Amazon EMR, allowing us to minimize the deployment effort and fully virtualize our processing resources. This approach also significantly reduces EC2 (Elastic Compute Cloud) costs, since most of the calculations can be offloaded to much cheaper EC2 spot instances.
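
As an illustration of the S3 integration, the sketch below shows a Flink batch (DataSet) job reading its input straight from S3 and writing results back. The bucket and paths are hypothetical, and the actual alignment logic is omitted; with the Hadoop S3 filesystem available on EMR, s3:// paths can be used like any other input path.

```java
// Sketch of a Flink batch job that reads its input directly from S3.
// Bucket and key names are hypothetical.
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class S3InputJob {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Read one record per line from an S3-hosted dataset.
        DataSet<String> records = env.readTextFile("s3://example-bucket/sequences/input.txt");

        // ... the actual transformations (e.g. alignment steps) would go here ...

        // Write results back to S3.
        records.writeAsText("s3://example-bucket/results/");
        env.execute("S3-backed batch job");
    }
}
```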

In one of our earlier posts, we mentioned that the cost difference between vertical and horizontal scaling for on-demand instances is negligible. However, the same principle does not hold when using spot instances. While bare EC2 spot instances require considerable maintenance effort, using spot instances with Amazon EMR is exceedingly simple. For our EMR cluster, we need a master and a core instance to meet the minimum cluster requirements, but after that we can easily scale horizontally using very cheap spot instances straight from the AWS console (we don’t need HDFS persistence for our calculations). Unfortunately, as of the writing of this post, this does require a restart of the Flink cluster as well. For us this is currently not a deal-breaker, because the spot market price for the instances we are using does not fluctuate heavily. However, if you want to make proper use of the flexibility of EC2 spot instances, this is definitely a missing feature. YARN-hosted autoscaling is listed on Flink’s roadmap for 2016, so we expect this problem to be addressed by the Flink development team relatively soon.
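
For completeness, here is a hedged sketch (using the AWS SDK for Java) of what adding a group of spot task instances to a running EMR cluster looks like programmatically. The cluster id, instance type, count and bid price are placeholders; the same thing can be done with a few clicks in the AWS console.

```java
// Sketch: add a group of spot task instances to a running EMR cluster.
// Cluster id, instance type, count and bid price are placeholders.
import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient;
import com.amazonaws.services.elasticmapreduce.model.AddInstanceGroupsRequest;
import com.amazonaws.services.elasticmapreduce.model.InstanceGroupConfig;
import com.amazonaws.services.elasticmapreduce.model.InstanceRoleType;
import com.amazonaws.services.elasticmapreduce.model.MarketType;

public class AddSpotTaskInstances {
    public static void main(String[] args) {
        AmazonElasticMapReduceClient emr = new AmazonElasticMapReduceClient();

        // Task instances carry no HDFS data, so losing one to the spot
        // market only costs us (part of) a computation, never data.
        InstanceGroupConfig spotTasks = new InstanceGroupConfig()
            .withName("flink-task-nodes")
            .withInstanceRole(InstanceRoleType.TASK)
            .withMarket(MarketType.SPOT)
            .withBidPrice("0.10")
            .withInstanceType("m3.xlarge")
            .withInstanceCount(4);

        emr.addInstanceGroups(new AddInstanceGroupsRequest()
            .withJobFlowId("j-XXXXXXXXXXXXX")
            .withInstanceGroups(spotTasks));
    }
}
```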

All in all, our first experiments with Flink have been a positive experience. It was relatively easy to integrate into our application, using Amazon S3-hosted datasets was painless, and the option to deploy on Amazon EMR reduces the time normally spent configuring a Hadoop cluster. We are certainly looking forward to using it again in future projects.


Serverless Architectures - AWS Lambda

This post is the third in a series of reports on our trip to and attendance of the AWS summit this May. If you want to start reading this series from the beginning, scroll down for the first part or check out this link.


To the more technically inclined, serverless is a somewhat silly term. Indeed, servers are still running the online scene, albeit a little more discreetly. Moreover, platforms as a service were a thing long before they were called serverless. Sometimes, though, renaming something can make it sound a lot sexier.

It is, however, not an entirely nonsensical term either: it stresses the fact that you as a client are not directly confronted with the traditional shape and form of a server. The service provider maintains, supports and likely even scales the platform for you, so you only have to think about what you are going to use the platform for.

For example, configuring and maintaining a full auto-scaling cluster of virtual private servers most likely sounds like too much work, and tangential to one’s (real) purposes. We can certainly do better. Often, server setup and maintenance are pure overhead, and cloud platforms offer an increasing number of managed appliances.

An early example could be Amazon RDS - Relational Database Service - where the provider takes (partial) responsibility for replication, maintenance and updates, but there is still a clear connection to the concept of a server and scaling is not really done automatically.

A more recent example is AWS Lambda, a service where you simply upload code that can subsequently be executed just like that: no servers to launch, no software to install. It provides abstraction at the function level rather than at the service level (cf. RDS). Scaling comes completely free of charge (figuratively speaking): multiple invocations of the same function will result in multiple instances of that function running at the same time. Moreover, when your functions are not being used, you are not paying: auto-scaling out of the box with no additional configuration.
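
As a concrete illustration, a Lambda function in Java is little more than a class with a single handler method. The sketch below is a minimal, hypothetical example; the input and output shapes are purely illustrative.

```java
// Minimal sketch of an AWS Lambda handler in Java: a single function,
// uploaded as-is, with no server to provision or maintain.
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

import java.util.Map;

public class HelloHandler implements RequestHandler<Map<String, Object>, String> {

    @Override
    public String handleRequest(Map<String, Object> input, Context context) {
        // Lambda scales instances of this function out automatically;
        // you only pay per invocation.
        Object name = input.getOrDefault("name", "world");
        context.getLogger().log("Invoked with name=" + name);
        return "Hello, " + name + "!";
    }
}
```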

Having code in the cloud that can be executed at the push of a button might be convenient, but it’s not very useful by itself. That is to say, Lambda functions are building blocks. They integrate rather nicely with multiple AWS services.

For example, Lambda functions can be triggered when a file is uploaded to S3, Amazon’s storage service. Images might be uploaded in full resolution, after which a Lambda function generates a number of lower-resolution versions for different purposes. Another example is the integration with Amazon API Gateway, which can link API URLs to specific Lambda functions. At XAOP we used this technology to implement a complete REST API, backed by Lambda functions to implement the logic and Amazon Elasticsearch Service to provide the data. (We are currently working on a case study that will be published on our website.)
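
The following sketch gives a rough idea of what such an S3-triggered function could look like in Java. The bucket and key handling follows the standard S3 event structure; the actual resizing step is only hinted at in a comment.

```java
// Sketch of a Lambda function triggered by an S3 upload (the image-resize
// example above). The resizing itself is left as a comment.
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.S3Event;
import com.amazonaws.services.s3.event.S3EventNotification.S3EventNotificationRecord;

public class ThumbnailHandler implements RequestHandler<S3Event, String> {

    @Override
    public String handleRequest(S3Event event, Context context) {
        for (S3EventNotificationRecord record : event.getRecords()) {
            String bucket = record.getS3().getBucket().getName();
            String key = record.getS3().getObject().getKey();
            context.getLogger().log("New upload: s3://" + bucket + "/" + key);

            // Here one would download the object, generate the lower-resolution
            // versions and upload them to a destination bucket or prefix.
        }
        return "ok";
    }
}
```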

Many more examples can be conceived, but it should be clear that this abstraction at the function level enables and facilitates microservice-oriented architectures, especially in combination with the API Gateway.

At the summit, we attended an interesting presentation on security in serverless applications. Beyond the usual best practices and the importance of a sound implementation of security and access control, one idea in particular caught our attention: wrapping the AWS API using API Gateway.

At face value, wrapping one REST API with another seems like a silly thing to do. When looked at as an extra layer of control, however, the possibilities become clearer. For example, the presenter talked specifically about Amazon CloudTrail, an auditing tool that can be configured to deliver AWS API call logs to S3 buckets. The service supports most, but not all, calls, and support for new services typically comes some time after they have been introduced to the general public. If you want the logging features right here and right now, AWS Lambda and API Gateway can easily be combined into a custom auditing solution.
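
To make the idea more tangible, here is a rough, hypothetical sketch of such a wrapper: a Lambda function (exposed through API Gateway) that writes its own audit record to S3 before performing the wrapped AWS call. The bucket name and the wrapped call (starting an EC2 instance) are assumptions made purely for illustration.

```java
// Hypothetical sketch of a custom audit wrapper around an AWS API call:
// log the request to S3 ourselves, then perform the wrapped call.
import com.amazonaws.services.ec2.AmazonEC2Client;
import com.amazonaws.services.ec2.model.StartInstancesRequest;
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.s3.AmazonS3Client;

import java.util.Map;

public class AuditedStartInstance implements RequestHandler<Map<String, String>, String> {

    private final AmazonEC2Client ec2 = new AmazonEC2Client();
    private final AmazonS3Client s3 = new AmazonS3Client();

    @Override
    public String handleRequest(Map<String, String> input, Context context) {
        String instanceId = input.get("instanceId");

        // Write our own audit record first, so we don't have to wait for
        // CloudTrail to support the call we are wrapping.
        String auditKey = "audit/" + context.getAwsRequestId() + ".log";
        s3.putObject("example-audit-bucket", auditKey,
                "StartInstances requested for " + instanceId);

        // The actual (wrapped) AWS API call.
        ec2.startInstances(new StartInstancesRequest().withInstanceIds(instanceId));
        return "started " + instanceId;
    }
}
```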


Cost Optimization on AWS

This post is the second in a series of reports on our trip to and attendance of the AWS summit this May. If you want to start reading this series from the beginning, scroll down for the first part or check out this link.

We rise early on Tuesday morning to make the trip from our hotel to the AWS summit venue, the NBC congress centrum, built to impress with its modern structure and vast expanses of glass. The day starts off with a partner network breakfast, where we attend a number of short presentations on what constitutes being a partner and the benefits thereof. Afterwards we head over to the keynote presentations, led by Dr. Werner Vogels, CTO of Amazon.com, the first dealing with the subject of cost optimization.

Relevant to any and all customers of a cloud offering: what is the Total Cost of Ownership? The answer depends on your current infrastructure. If you are a startup, it’s a tradeoff between a high up-front investment in on-premises hardware and starting small in the cloud, letting your environment grow with you dynamically. If you are a company that has been around for a while, you might want to consider moving to the cloud; even though you’ve already made the initial investments in on-premises hardware, it’s likely that moving to the cloud will be more cost-effective in the long run. Obtaining hard numbers is easier than one might suspect: AWS created a TCO calculator that will do the number-crunching for you: https://awstcocalculator.com/.

The benefits of a dynamic and flexible on-demand environment are quite appealing to a start-up, and we’re sure that, after taking a look at the financial aspects, you’ll be hard-pressed to find reasons to opt for on-premises infrastructure, or to keep up long-term maintenance of an on-premises or colocated setup.

Spending money is very easy, so it remains important to tread carefully when using the AWS cloud and make sure costs are not higher than they should be. The basics remain important: servers should only be as large as their workload warrants, and permanent instances should be reserved. Having your servers be just the right size and reserving them for a fixed term does beg the question: what if demand rises? The solution to this problem lies in the elasticity of your infrastructure. Having your capacity scale on demand, while provisioning a baseline capacity through reserved instances, helps optimize for cost without sacrificing performance.
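
As a rough illustration of that pattern, the sketch below (AWS SDK for Java) creates an Auto Scaling group whose minimum size corresponds to the capacity covered by reserved instances, with headroom for on-demand peaks. All names, sizes and zones are placeholders.

```java
// Sketch: reserved baseline plus on-demand elasticity via an Auto Scaling
// group. Group name, launch configuration, sizes and zones are placeholders.
import com.amazonaws.services.autoscaling.AmazonAutoScalingClient;
import com.amazonaws.services.autoscaling.model.CreateAutoScalingGroupRequest;

public class BaselinePlusElasticity {
    public static void main(String[] args) {
        AmazonAutoScalingClient autoScaling = new AmazonAutoScalingClient();

        autoScaling.createAutoScalingGroup(new CreateAutoScalingGroupRequest()
            .withAutoScalingGroupName("web-tier")
            .withLaunchConfigurationName("web-tier-launch-config")
            // Baseline capacity covered by reserved instances.
            .withMinSize(2)
            .withDesiredCapacity(2)
            // Headroom for peaks, paid on-demand only while running.
            .withMaxSize(8)
            .withAvailabilityZones("eu-west-1a", "eu-west-1b"));
    }
}
```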

Server scaling can be divided into two main strategies: so-called vertical and horizontal scaling. Vertical scaling means increasing the capacity of a single machine, e.g. adding RAM or CPU cores. With horizontal scaling, a load balancer distributes traffic across multiple instances of (typically) the same server type.

On AWS, the difference in cost between the two strategies is negligible: an instance twice as big is also twice as expensive (for example, one m4.xlarge costs roughly the same as two m4.large instances). The main advantage of vertical scaling is that it is typically the most straightforward way to increase performance with little or no impact on the software architecture. Conversely, horizontal scaling is very dynamic, as instances can be added or removed in less than a minute, but this more volatile infrastructure has, of course, implications for the software side.

It is important to note that these are mainly optimization techniques: adapting your software architecture to the cloud and to horizontal scaling can be worth the investment. For example, we helped one of our clients save €4000 on their monthly AWS bill by changing the software architecture and introducing a horizontal scaling approach.
