Kobas tames technology for hospitality management. Our Tech Talk articles focus on the technology behind Kobas.
Why AWS Aurora Doesn’t Work For Kobas
Autumn is the time of year where we proactively review our infrastructure to make sure it is ready for the inevitable rise in demand in December. Not only do our clients enjoy elevated trade in December due to the Christmas season, which in turn generates more load on our systems, but there’s also peak demand on Mondays in December to analyse stock usage, and of course to get hours approved and payrolls run ahead of the Christmas break itself.
Usually we plan to add additional database servers or increase the hardware resource behind those virtual machines, but this year we decided to first review Amazon’s own database service, Aurora. This statement from Amazon’s site both explains what Aurora is and why we were so interested:
“Amazon Aurora is a relational database engine that combines the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open source databases. It delivers up to five times the throughput of standard MySQL”.
Source: Amazon Aurora Product Details
So Aurora claims to be both directly compatible with MySQL, and offers up to five times the throughput. Very interesting!
INSERT speed tests
The first step to put these claims to the test was to spin up an Aurora instance. Rather than replicating the more complex infrastructure that runs our production service, we choose to replicate our training service; a mirrored, playground version of Kobas Cloud that is reset weekly, allowing our clients to train users on various Kobas Cloud tasks in a safe environment.
As our training platform doesn’t receive any intensive usage, the whole thing currently runs on a t2.medium MariaDB 10.1.23 instance. This means the virtual machine has access to 2 vCPUs and 4 GB RAM. It runs in a single availability zone (AZ) as it isn’t operationally critical.
Every weekend then, each client account on training is refreshed from a backup – a simple MySQL dump import. So the first step was to import some client accounts into an Aurora instance of the same size, a t2.medium, with a matching parameter group.
I was indeed immediately struck by the performance difference – but not in a good way. Aurora was notably slower. How much slower? We started with an SQL backup of a relatively small account. Total SQL size was 224M – across 286 tables. Let’s compare the import times.
Instance Type | Import 1 (m:s) | Import 2 (m:s) | Import 3 (m:s) | Average (m:s) |
---|---|---|---|---|
MariaDB 10.1.23 t2.medium | 1:15 | 1:04 | 1:10 | 1:10 |
Aurora t2.medium | 3:28 | 3:06 | 3:20 | 3:18 |
The results are clear: under these circumstances Aurora is nearly three times slower than MariaDB. The test was even skewed in Aurora’s favour, as a bunch of other processes were operating on our MariaDB instance while these benchmarks were taking place, whereas the test Aurora instance had literally nothing else to do.
Now while bulk importing may not be representative of usual database use, Kobas Cloud is a fairly write heavy application, so this matters. Every action that takes place on any of the hundreds of tills we have in daily use is written to our database. User logs in to till: new row. Customer buys a burrito and a beer: new row for the burrito, new row for the beer, new row for the order, new row for the payment, new row for the beer usage, the tortilla wrap, each filling – you get the picture. Lots of writes. The database needs to be quick!
The situation actually worsened as load increased. As mentioned, every weekend all client training accounts are refreshed from their live counterparts. This process is supposed to happen overnight, and with various shards updating concurrently, takes about 3 hours into the MariaDB training instance. It was taking very nearly 24 hours into the Aurora instance. Nearly eight times slower!
More general use
Once these imports were complete we were in a position to A/B test the application on the two database back ends. Moving around the application, generating some meaty reports and generally, deliberately giving the application a hard time, the results were mixed. Some reports were a little more nimble on Aurora, but other actions which required some write activity were more sluggish.
MySQL Slap
We used the industry standard MySQL Slap tool with the following flags:
/bin/mysqlslap --auto-generate-sql --verbose --concurrency=50 --iterations=100 --number-int-cols=5 --number-char-cols=20
Results for Aurora
Benchmark Benchmark Average number of seconds to run all queries: 3.297 seconds Minimum number of seconds to run all queries: 1.609 seconds Maximum number of seconds to run all queries: 7.172 seconds Number of clients running queries: 50
Results for MariaDB 10.1.23
Benchmark Average number of seconds to run all queries: 1.506 seconds Minimum number of seconds to run all queries: 1.356 seconds Maximum number of seconds to run all queries: 1.805 seconds Number of clients running queries: 50
On average then, Aurora takes more than twice as long as MariaDB 10.1.23.
“Up to 5 times the throughput of standard MySQL”
We spent hours researching various parameters and looking for other reports on the Internet. There had to be a reason why we were actually seeing worse performance. It becomes clear when investigating this speed claim further:
“Testing on standard benchmarks such as SysBench have shown up to a 5x increase in throughput performance over stock MySQL 5.6 on similar hardware.”
Source: Amazon Aurora Product Details: MySQL-Compatible Edition
MariaDB 10.1 routinely benchmarks faster than MySQL 5.6. So we are already enjoying this advantage. Aurora certainly has some very impressive replication speed, and its MySQL compatibility is good enough for our application use, but this isn’t enough to make a switch over anything but a regressive step in performance terms.
Conclusions
Overall then, we didn’t feel that our users would benefit directly or indirectly from a move to Aurora. Directly the application was roughly the same. Our API would certainly have struggled with the reduction in write speed, which would have indirectly lead to some of our services appearing to be sluggish.
Internally it would have given our team a couple of headaches in addition to the poor insert performance, and when the team has these kind of issues, it also indirectly affects our users as our features take longer to develop.
The first headache would be the loss of MariaDB 10.1 syntax and features. An example of this is that the ability to make schema declarations such as ALTER TABLE foo ADD COLUMN IF NOT EXISTS bar... is important if we want to create idempotent update packs. This kind of query fails on Aurora because it is only MySQL 5.6 compatible.
The AWS team have stated they are working on 5.7 compatibility for Aurora, but that’s been the case for over a year. By the time that does appear, MySQL 8.0 could well be out.
This leads nicely into our second headache: compatibility with our EPoS servers, and our engineering environments. Right now, Kobas Cloud, Kobas EPoS and all our engineering and QA rigs run MariaDB 10.1 as their database service. It’s all the same. This is super useful. Tools can be used on all platforms, and also our engineers only need to operate with a single SQL toolkit in their brains.
Were we to change to Aurora, which is effectively MySQL 5.6, in order to achieve the same compatibility we would have to downgrade all these other environments. We don’t want to do that. In fact, we want to upgrade everything to MariaDB 10.2, but we have chosen not to do that until AWS make MariaDB 10.2 available in RDS. The ubiquity of a single SQL version is really important to us, and this exploration into Aurora has really helped us crystalise that view.
There’s no doubt that if we were to re-engineer our application to make use of Aurora features, it would be possible to improve our performance. In particular, we could use the impressive replication speed to divert many report reads to replicas, but overall the engineering time and cost required for this makes little sense. It is far more effective for us to spend a few hundred more dollars a month on more powerful MariaDB infrastructure, as the decision to use that technology is already paying dividends.