So while I was at the MySQL UC, The Xeround database came to my attention. It bills itself as database as a service for MySQL systems and a seamless replacement for standard MySQL.
Of course, since I am a QA Engineer, I could not resist the urge to try to break it >:) As my friend and former MySQL colleage, Kostja says, “QA Engineers are a unique breed…they like to push all the buttons” : ) I would say that the QA mindset goes a bit further than that, but it is something I will delve into in another post. I will only say that there is a reason that Microsoft recognizes QA software engineering as a distinct and specialized discipline.
So, let’s get back to Xeround. It was the first database as a service that caught my eye and I just had to test it! They are currently offering a free beta. It is remarkably easy and fast to get set up with a test database and the web-based dashboard they provide is pretty interesting and offers some good information (though some of it is confusing…more on that in a bit)
It was my intent to run a small handful of tests with the mighty, mighty randgen!
My tests were as follows:
- outer_join grammar – creates seriously nasty JOIN queries that can use up to 20 tables
- transactional grammar – we have a grammar that creates a variety of transactions. Some good, some bad, with lots of ROLLBACKs and SAVEPOINTs sprinkled in for spice.
- subqueries – the nastiest grammar I have created and as I have mentioned elsewhere, it is also part of why we are just now seeing optimizer features like index condition pushdown (ICP) being reintroduced to MySQL >: )
My thoughts were that these could be quickly executed and point out any serious problems in basic functionality. MySQL and Drizzle both use these grammars as part of their testing. Drizzle must survive these tests on every push to trunk, so these seem like reasonable stressors for a new engine >: )
It should be noted that I had to modify the test grammars to accomodate some Xeround limitations, the modified randgen branch I used is here. It can be branched via bzr branch lp:~patrick-crews/randgen/randgen_drizzle_exp
Each grammar would be run with the randgen’s –debug option. This is because the user is presented with a nice report at the end of the run which indicates: query_count:row_count (ie how many queries returned how many rows):
# 2011-04-27T20:40:18 Rows returned:
$VAR1 = {
‘ 0’ => 59,
‘ 1’ => 2,
‘ 4’ => 1,
‘ 9’ => 1,
‘ -1’ => 35,
‘>100’ => 1
};
I would use this as a comparison point against MySQL 5.1. Granted, I could use the –Validator=ResultsetComparatorSimplify option, but then I would have an actual bug report that I would feel compelled to file and this would feel less like fun and more like work ; ) However, I have been in contact with engineers from Xeround and have shared my findings with them.
For the transactional grammar, I would run the grammar on each system and then do a diff of mysqldump files from each database. As Xeround is a MySQL engine, this could cause some differences, but the data in the tables should be consistent.
Before I get into the testing results, I’ll provide some overall impressions:
As I said, the web interface is pretty nice and provides you with a lot of useful information. It allows you to easily create a new named database instance and provides you with data such as status, scale, uptime, cpu utilization, memory utilization, number of connections, ops/sec, and message count. Scale refers to the autoscale capabilities that Xeround advertises. For the beta, you are allowed to scale from 3 to 4 servers. 3 servers is considered 100%, adding the extra server (when certain user-specified CPU or Memory limits are hit) equates to 133% . Interestingly enough, I observed that there were always 6 active connections when the database was idle (probably some of the Xeround ‘secret sauce‘ working…).
The control panel also allows the user to set the CPU, memory, and connections limits that will trigger scale up (and possibly scale down). In my experiments, I never seemed to tax memory or connections, but CPU limits were hit and auto-scale did trigger, though I will admit that I didn’t observe any noticeable change in the test execution.
There are also tabs for backup (not available in the free beta, though mysqldump does work against a Xeround instance), general monitoring which provides real-time information about cpu, memory and connections, and an events (messages tab). The one thing I noted about the events tab was that I received a number of warning messages about the health of my database during times I wasn’t using it. However, it is a beta service for general evaluation and certain oddities are to be expected.
Here is what I found with my tests:
1) Xeround is a MySQL engine. They do advertise this, but the main reason I noticed that all of my created test tables were now ‘Engine=Xeround’ was that I was unable to create a varchar_1024 indexed column. Xeround is limited to 255 characters max:
# 2011-04-27T19:50:27 key (`col_char_1024_key` )) failed: 1074 Column length too big for column 'col_char_1024' (max = 255); use BLOB or TEXT instead
This limitation required modification of the randgen grammars and gendata files to limit char columns to 255. As noted above, you can find the modified version of the randgen here.
2) Tables with an ENGINE=$engine_name argument are processed without an issue (ie you should be able to use a dumpfile without problems) and are converted to Xeround tables. One thing to note is that dumpfiles *from* Xeround have ENGINE=Xeround for the CREATE TABLE statements
create table t1 (a int not null auto_increment, primary key(a)) engine=innodb;
Query OK, 0 rows affected, 2 warnings (0.702761 sec)
drizzle> show create table t1;
+-------+---------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+---------------------------------------------------------------------------------------------------------------------------------------+
| t1 | CREATE TABLE `t1` (
`a` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`a`)
) ENGINE=Xeround DEFAULT CHARSET=utf8 COLLATE=utf8_bin |
+-------+---------------------------------------------------------------------------------------------------------------------------------------+
3) outer_join grammar:
I used the following command line:
./gentest.pl --gendata=conf/drizzle/outer_join_drizzle.zz --grammar=conf/drizzle/outer_join_drizzle.yy --queries=100 --threads=1 --dsn=dbi:mysql:host=00.00.00.00:port=9999:user=USER:password=PASSWORD:database=test --sqltrace --debug
The test is designed to generate queries with large numbers of tables (up to ~20). The test ran without much incident. The Xeround server monitor indicated that the CPU was hovering near 80% for most of the time, but again…beta test setup, so I’ll give them some leeway.
The big trouble is what follows. Remember those randgen summary reports I mentioned earlier? Below is a comparison of Xeround vs. MySQL for the same command line. The values are row_count’ => number_of_queries_returning_said_row_count. What this means is that for the same set of queries, Xeround and MySQL do not always return the same result sets. I did not note any differences in query failures, so this simply indicates that results processing is differing somewhere : ( To elaborate, Xeround had 56 queries that returned 0 rows, for the same workload, MySQL only had 39. A row count of -1 indicates that there was an error with the query, such as referencing a table or column that doesn’t exist. Somehow, Xeround hit fewer errors than MySQL, though that is also worrisome – why do they register errors differently?
Xeround:
# 2011-04-27T20:11:05 Rows returned:
$VAR1 = {
' 0' => 56,
' 1' => 16,
' 2' => 6,
' 3' => 2,
' 5' => 1,
' 6' => 1,
' 7' => 1,
' 8' => 1,
' -1' => 13,
' 10' => 2,
'>10' => 1
};
MySQL 5.1
$VAR1 = {
' 0' => 39,
' 1' => 15,
' 2' => 2,
' 3' => 2,
' 4' => 1,
' 7' => 2,
' 8' => 1,
' -1' => 32,
' 10' => 1,
'>10' => 5
};
4) transactional grammar:
I used the following command line:
./gentest.pl --gendata=conf/drizzle/translog_drizzle.zz --grammar=conf/drizzle/translog_concurrent1.yy --queries=100 --threads=1 --dsn=dbi:mysql:host=00.00.00.00:port=9999:user=USER:password=PASSWORD:database=test --sqltrace --debug
This grammar generates a variety of transactions and standalone queries. The queries generated consist of both good and invalid SQL with lots of ROLLBACK’s and SAVEPOINT’s here and there. Unfortunately, I noticed a large number of differences. We’ll start with the easiest one:
< DROP TABLE IF EXISTS `A`;
< CREATE TABLE `A` (
---
> DROP TABLE IF EXISTS `a`;
> CREATE TABLE `a` (
50c50
< ) ENGINE='InnoDB' AUTO_INCREMENT=105 COLLATE='utf8_general_ci';
---
> ) ENGINE='Xeround' COLLATE='utf8_bin';
It isn’t huge, but Xeround apparently auto-converts tables names to lower-case. The randgen attempts to create table `A`, but it is stored as table `a`. This could be an issue for some people, but Xeround does say that the beta is for people to evaluate the system’s suitability for their purposes.
The big issue is that Xeround appears to not have registered a lot of the transactions issued by the randgen. The Xeround dumpfile only contained the original 10 rows from table `a`, while the MySQL 5.1 version I ran locally had 94 rows by the end of the randgen run : (
Further research of the randgen logs indicate the following issue:
# 2011-04-27T20:06:56 Query: INSERT INTO `d` ( `col_char_10` , `col_char_10_key` , `col_char_10_not_null` , `col_char_10_not_null_key` , `col_char_255` , `col_char_255_key` , `col_char_255_not_null` , `col_char_255_not_null_key` , `col_int` , `col_int_key` , `col_int_not_null` , `col_int_not_null_key` , `col_bigint` , `col_bigint_key` , `col_bigint_not_null` , `col_bigint_not_null_key` , `col_enum` , `col_enum_key` , `col_enum_not_null` , `col_enum_not_null_key` , `col_text` , `col_text_key` , `col_text_not_null` , `col_text_not_null_key` ) SELECT `col_char_10` , `col_char_10_key` , `col_char_10_not_null` , `col_char_10_not_null_key` , `col_char_255` , `col_char_255_key` , `col_char_255_not_null` , `col_char_255_not_null_key` , `col_int` , `col_int_key` , `col_int_not_null` , `col_int_not_null_key` , `col_bigint` , `col_bigint_key` , `col_bigint_not_null` , `col_bigint_not_null_key` , `col_enum` , `col_enum_key` , `col_enum_not_null` , `col_enum_not_null_key` , `col_text` , `col_text_key` , `col_text_not_null` , `col_text_not_null_key` FROM `bb` ORDER BY `col_bigint`,`col_bigint_key`,`col_bigint_not_null`,`col_bigint_not_null_key`,`col_char_10`,`col_char_10_key`,`col_char_10_not_null`,`col_char_10_not_null_key`,`col_char_255`,`col_char_255_key`,`col_char_255_not_null`,`col_char_255_not_null_key`,`col_enum`,`col_enum_key`,`col_enum_not_null`,`col_enum_not_null_key`,`col_int`,`col_int_key`,`col_int_not_null`,`col_int_not_null_key`,`col_text`,`col_text_key`,`col_text_not_null`,`col_text_not_null_key`,`pk` LIMIT 50 /*Generated by THREAD_ID 1*/ failed: 1038 Out of sort memory; increase server sort buffer size
So, it would appear that transactions are failing for some reason or another. However, I repeat the disclaimer about this being a beta and not a production deployment. It could have something to do with the resources allocated for each beta user.
5) Subquery grammar
This was the initial test I ran, but I have saved it for last. First of all, the command line:
./gentest.pl --gendata=conf/drizzle/drizzle.zz --grammar=conf/drizzle/optimizer_subquery_drizzle.yy --queries=100 --threads=1 --dsn=dbi:mysql:host=00.00.00.00:port=9999:user=USER:password=PASSWORD:database=test --sqltrace --debug
This test generates some very nasty subquery-laded queries (see below). The first thing I noticed on the single-threaded run was that Xeround seemed to not like this query very much at all:
SELECT table2 . `col_int` AS field1 FROM ( CC AS table1 STRAIGHT_JOIN ( ( CC AS table2 STRAIGHT_JOIN CC AS table3 ON (table3 . `col_bigint_key` = table2 . `col_int_not_null_key` ) ) ) ON (table3 . `col_text_not_null_key` = table2 . `col_char_10_key` ) ) WHERE ( table1 . `col_int` NOT IN ( SELECT SUBQUERY1_t1 . `col_int_not_null_key` AS SUBQUERY1_field1 FROM ( BB AS SUBQUERY1_t1 INNER JOIN ( CC AS SUBQUERY1_t2 INNER JOIN BB AS SUBQUERY1_t3 ON (SUBQUERY1_t3 . `col_char_10_key` = SUBQUERY1_t2 . `col_char_10_key` ) ) ON (SUBQUERY1_t3 . `col_char_10_not_null_key` = SUBQUERY1_t2 . `col_char_10` ) ) WHERE SUBQUERY1_t2 . `col_bigint` != table1 . `pk` OR SUBQUERY1_t2 . `pk` >= table2 . `pk` ) ) OR ( table1 . `col_int_key` BETWEEN 48 AND ( 48 + 183 ) OR table1 . `pk` BETWEEN 48 AND ( 48 + 104 ) ) GROUP BY field1 ;
Now it is quite nasty, but standard MySQL executes it with a minimum of fuss (though it does take a moment to handle this monster as well).
The other thing is that Xeround took an exceedingly long time to execute this workload. While the other grammars executed in moderate amounts of time (my testing was from a hotel room in Santa Clara while the instance is in Chicago), the subquery test was noticeably slow. I was able to walk down to the lobby, buy something, and return to my room while it was dealing with the nasty query above : ( For some context, running the same command line on my laptop took 8 seconds, Xeround took 14 minutes, but again…beta test setup and hardware, so YMMV.
Finally, we have the dreaded row count report:
Xeround:
# 2011-04-27T20:45:19 Rows returned:
$VAR1 = {
' 0' => 59,
' 1' => 2,
' 4' => 1,
' -1' => 35,
'>10' => 1,
'>100' => 1
};
MySQL 5.1:
# 2011-04-27T20:40:18 Rows returned:
$VAR1 = {
' 0' => 59,
' 1' => 2,
' 4' => 1,
' 9' => 1,
' -1' => 35,
'>100' => 1
};
As we can see, there is 1 query out of the 100 issued where result sets differed (returning 9 rows in MySQL vs. >10 rows in Xeround).
I also tried using –threads=10 to really stress the Xeround system (I didn’t bother with MySQL, it handles 10 threads of nasty subqueries like a champ…incidentally, so does Drizzle) ; ) Xeround was able to handle the workload and did so in 27 minutes. Since single-threaded took 14 minutes, perhaps Xeround doesn’t really begin to shine until we start hitting large numbers of concurrent connections?
So what can I say from the results of these informal tests? Personally, I would hesitate to say that Xeround is a drop-in replacement. The limitations on column sizes, changes in table naming, and differing result sets are a bit worrisome. However, I will say that the Xeround engineers I met at the UC were very engaged and interested in my findings and have made significant strides in subquery processing since my initial tests. I believe that with time these issues will be fixed and that not every customer will run into them (I know I’m beating this into the ground, but I was using a beta test system). Behavior may be different on a production machine and not every MySQL user will generate such workloads and every customer should perform their own careful testing and evaluation before making any changes to their systems.
My personal interest ends here. The UC introduced me to a number of interesting new storage engines and I was mainly curious about ways of evaluating them. This was a quick and dirty bit of testing just to see if I could produce any interesting pyrotechnics ; ) Go go randgen!
I really want this picture to be shown when anyone searches for 'randgen' ; )
In all seriousness, I highly recommend adoption of the random query generator. It offers a modular and customizable system for creating evaluation tools (like result set comparison, execution time comparison, replication validation, etc, etc) and has been used in production-level testing for MySQL, MariaDB and Drizzle for some time. It also plays with Postgresql and Java DB (kind of scary that 40% of that list is owned by Oracle…), so please give it a spin and see what kinds of pretty explosions you can make…who knows, testing might actually become fun for non-QA folks >; )
Additionally, these tests only took me about half an hour to setup and execute. Granted, I have been using the tool for some time, but 30 minutes to identify a number of potential problem areas seems pretty awesome to me, but then again, I am a QA Engineer and we live for such things.