Dec 29, 2011

5 Options to Backup Your MySQL and Keep Your Service Online

An old customer was facing a major issue. They had just expanded their service from a local shop to a global "follow the sun" operation, providing SaaS to worldwide customers.
Their MySQL servers were the finest, and their procedures remained the same, including the backup procedures. Yet, concepts that fit a local shop may not fit a global firm.

Going from Local to Global
Their current procedures include both a binary backup and a mysqldump of the server. Although both procedures aim to keep the service available while performing the backup, the bottom line is that the CPU is highly utilized and the service is poor.

Backup Methods
There are two major backup methods (you can read more about them and others at Lenz Grimmer's presentation):

  1. mysqldump: This method extracts the database content into a flat file. This file includes all the DDL and DML SQL statements. Although this method is considered a "hot backup", it tends to reduce the SLA and leave users unsatisfied. Moreover, if the database is not locked during the process, you may end up with an inconsistent backup.
  2. Binary files backup: This method requires stopping the MySQL server. On the other hand, if you copy the binary files to a disk on the same physical server, the downtime can be very short. To keep the files in a safe place, you should later back up these copied files over the network. Please notice that this method has another disadvantage: one of the binary files may be corrupted and you may not notice it until the money time.
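As an illustration of method 1, a reasonably consistent "hot" dump of InnoDB tables can be taken without a global lock; the database name and credentials below are placeholders:

```
# Hedged sketch: a consistent dump of an InnoDB database without a
# global read lock. --single-transaction applies to InnoDB tables only;
# --master-data=2 records the binary log position as a comment.
mysqldump --single-transaction --master-data=2 \
  -u backup_user -p shop_db > shop_db_$(date +%Y%m%d).sql
```

Note that --single-transaction does not help for MyISAM tables, which still require a lock for a consistent snapshot.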
Where Can the Backup Be Performed?
There are three possible locations:

  1. The Master Server: The safest option, since we know that no data "disappeared" during replication. The significant drawback is service downtime...
  2. An Existing Slave Server: At the risk of losing information that was not replicated correctly, we get a backup with no downtime. Yet, notice that if you use this server for other tasks, you may affect those tasks' SLAs.
  3. A Dedicated Slave: This can be a lightweight machine (yes, even a VPS) that can perform backups one or more times a day without causing downtime and without affecting the service at all.
The Bottom Line, or How to Keep Their Service Available?
Well, in this case, I suggested that the customer choose the Dedicated Slave option, which at a very small price helped them reduce service downtime and provide a great SLA to their global customers.

Keep Performing,
Moshe Kaplan

Nov 28, 2011

How to Install MySQL on an Ubuntu Based Machine

It is a common question: "How should I install MySQL on an Ubuntu based machine?"
To make it easier for you, I documented the procedure:

1. Uninstall current MySQL packages
sudo dpkg --get-selections | grep -i mysql
sudo dpkg --purge mysql-package-name

2. Download the relevant generic RPM packages from the MySQL site (server, client and devel)
wget path

3. Install alien
sudo apt-get install alien

4. Install the 3 packages using alien
sudo alien -i downloaded_package_file.rpm

Keep Performing,
Moshe Kaplan

Oct 30, 2011

Adding One More MySQL Slave to Cloud Based System (Amazon AWS)

MySQL replication tends to be complex.

Creating a new slave to support your system's expansion is no easier: copying large data files over the LAN, recovering replication, etc.

Fortunately, virtualization and SAN help us do it more easily and faster.

Luckily, cloud computing makes it even easier.

Adding a New Slave is Easier than Ever
  1. Select an existing slave that can be rebooted without affecting the service.
    1. Take the slave out of the production environment.
    2. Stop replication and perform a read lock: STOP SLAVE; FLUSH TABLES WITH READ LOCK;
    3. Create an AMI based on this instance using the AWS console or API.
    4. When the AMI creation completes, recover the instance and get the slave back to production: UNLOCK TABLES; START SLAVE; SHOW SLAVE STATUS\G
  2. Create a new slave based on the AMI
    1. Launch a new instance from the AMI using the AWS console or API.
    2. Change the server's /etc/my.cnf file to fit a new slave instance:
      1. Change the server-id=N value to a new, unique numeric value.
      2. Set relay-log='server-relay-bin' to the original slave server's relay log name (to make it even easier, configure relay-log=mysqld-relay-bin in the first slave's my.cnf, and you will be able to skip this step).
    3. Restart the MySQL instance: sudo /etc/init.d/mysql restart
    4. Add the new slave to the production environment.
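For illustration, the relevant part of the new slave's /etc/my.cnf from step 2.2 might look like the fragment below (the server-id value is an example; it just has to be unique):

```
[mysqld]
# Must be unique across the whole replication topology
server-id = 12
# Match the original slave's relay log base name, or set
# relay-log = mysqld-relay-bin on every slave to skip this step
relay-log = server-relay-bin
```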
Bottom Line
Expanding your business today is easier, without the need for file copies or other tedious actions.

Keep Performing,

Oct 3, 2011

MySQL CentOS and Amazon Web Services (AWS)

Many Linux experts prefer CentOS. The reason is clear: Red Hat's RHEL is probably the best Linux server distribution, and CentOS is very much alike (except for the yearly fees).
This distribution (and Fedora as well) is very common for MySQL installations, since there are many predefined packages for it. This is unlike other distributions such as Ubuntu.

Therefore, before you perform your database migration to the cloud, I recommend that you first verify your needs. The following items will help you implement your systems right:

  1. Select the Linux flavor
    1. CentOS: Detailed options are available below. Don't forget to verify the recent CentOS version: usually there are two (an early major version such as 6.0 and a mature minor version such as 5.7), and you should select the right version for you.
    2. Fedora: Some free and good AMIs are available.
    3. RHEL: There is a premium AMI for it.
    4. Ubuntu: No official MySQL distribution exists for this flavor.
    5. Debian.
  2. Select the storage type. Amazon AMIs can store their information on one of two storage types:
    1. Instance storage: server-attached disk storage that is great for raw performance (no network traffic or latency for storage). Yet, it lacks the option to seamlessly extend the disk to a larger size, and backup must be done over the network instead of by instant disk duplication (which means longer downtime).
    2. EBS: Storage Area Network based disks that can easily be extended and backed up with minimal server downtime. Yet, this storage costs both money (~$0.15/GB per month) and network latency.
  3. Select the right image for you:
    1. RightScale Image: Last updated Jan 2009. If you don't use RightScale, don't forget to remove their management tools, and don't forget to update your version to a recent one. RightScale provides images for both storage types.
    2. Amazon Linux AMI: The AWS image is based on the CentOS image, yet it includes some tweaks. The plus is having an out-of-the-box solution with Amazon support. The minus is that it is no longer 100% RHEL compatible.
  4. Install your database (MySQL probably) and web server (Apache probably) on the machine.
    1. The recommended MySQL version is MySQL 5.5, which includes major performance enhancements over MySQL 5.1.
Bottom Line
Now, you just need to choose the configuration that best fits your needs.

Keep Performing,
Moshe Kaplan

Moshe Kaplan is an entrepreneur and a cloud systems consultant

Sep 29, 2011

MySQL Unit Testing and Re-factoring Verification

If your software is highly dependent on a database, verifying the re-factoring of data intensive processes can be a pain in the a$$. Yet, you probably would like to avoid a production change that results in a total mess in your data.

Verification and Automation are Easier with MySQL
MySQL helps us verify the change by using MD5 hashing. Please notice that the following works only on the Linux platform.

MD5 Hash Based Verification Step By Step

  1. Backup the baseline table before running any process: CREATE TABLE class_baseline SELECT * FROM class;
  2. Run the original process.
  3. Copy the results of the old process to a new table : CREATE TABLE class_original SELECT * FROM class;
  4. If needed, TRUNCATE the table or restore the baseline table results TRUNCATE class; INSERT INTO class SELECT * FROM class_baseline;
  5. Run the modified process.
  6. Change the pager to MD5. The result is a single short MD5 hash instead of the regular screen output: pager md5sum -.
  7. Perform a select statement on the original table. Don't forget to drop from the SELECT statement any auto-increment or timestamp fields that change on every table modification/insert. If your process re-factoring changes the insert order, don't forget to use an ORDER BY to avoid the case of same content/different order: SELECT class_name, class_location FROM class_original ORDER BY class_name, class_location;
  8. Perform the same on the modified process result: SELECT class_name, class_location FROM class ORDER BY class_name, class_location;
  9. Compare the two results. If your re-factoring did not change the data itself, you should receive identical hash results.
  10. Recover to regular screen output mode using the nopager command.
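The order-insensitive comparison idea can be demonstrated outside MySQL as well; here is a small shell sketch (file names and contents are made up) showing that identical content hashes identically once both sides are sorted, regardless of row order:

```shell
# Illustrative sketch: two result sets with the same rows in different
# order produce the same hash once both are sorted, mirroring the
# ORDER BY trick above.
printf 'alice,room1\nbob,room2\n' > original.txt
printf 'bob,room2\nalice,room1\n' > modified.txt
h1=$(sort original.txt | md5sum | cut -d' ' -f1)
h2=$(sort modified.txt | md5sum | cut -d' ' -f1)
if [ "$h1" = "$h2" ]; then echo "identical"; else echo "different"; fi
```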
Bottom Line
Re-factoring is much simpler with MD5 hashing.

Keep Performing,

Aug 31, 2011

Sharding is Now COTS

I wrote here a lot in the past regarding SQL Server and MySQL sharding. I wanted to update you that ScaleBase, which is led by the industry veterans Liran Zelkha and Doron Levari, announced that ScaleBase 1.0 is now available.
ScaleBase provides a fine solution to MySQL sharding by implementing an SQL load balancer in front of your servers.
This product is available as a service and as a downloadable product. ScaleBase 1.0 supports:
  1. Read/Write splitting
  2. Transparent Sharding
  3. High Availability
If you need a proper solution for a large MySQL database, yet you don't have the time or mind to implement sharding by yourselves, you should take a look at their product.

Keep Performing,
Moshe Kaplan

Jul 10, 2011

MySQL Statement Profiling Is Easier Now!

If you have always wanted to know why your SQL statement runs so slowly and you are a MySQL DBA, you have probably used the EXPLAIN command to understand the estimated execution plan.

Yet, how can you know which part of the execution takes most of the time when you process the SQL statement? Will it be the I/O, the sorting, the data sending or the freeing of items?

One Step Ahead
The answer is simpler than you think: the MySQL Profiler, a built-in command line tool available in MySQL since version 5.0, lets us know exactly why our statement takes so long.

How to Do That?
  1. Decide how many queries you want to profile: the default number is 15 and can be changed to anywhere from 1 to 100 using the profiling_history_size session variable.
  2. Enable the MySQL Profiler.
  3. Perform the query.
  4. See the exact running time.
  5. Analyze the results.
  6. Disable the MySQL Profiler.
Give Us Some Code!
mysql> SET profiling = 1;
mysql> SELECT * FROM buffer_log WHERE buffer_num=1 ORDER BY end_date DESC LIMIT 100;
mysql> SELECT SUM(duration) FROM information_schema.profiling WHERE query_id=1;
mysql> SHOW PROFILE FOR QUERY 1;

| Status                         | Duration |
| starting                       | 0.000022 |
| checking query cache for query | 0.000064 |
| Opening tables                 | 0.000011 |
| System lock                    | 0.000003 |
| Table lock                     | 0.000029 |
| init                           | 0.000027 |
| optimizing                     | 0.000008 |
| statistics                     | 0.020821 |
| preparing                      | 0.000023 |
| executing                      | 0.000004 |
| Sorting result                 | 0.291548 |
| Sending data                   | 0.000317 |
| end                            | 0.000004 |
| end                            | 0.000003 |
| query end                      | 0.000003 |
| storing result in query cache  | 0.000007 |
| freeing items                  | 0.000011 |
| closing tables                 | 0.000003 |
| logging slow query             | 0.000001 |
| cleaning up                    | 0.000004 |

In this case you can easily see that sorting is our major issue. If you run an EXPLAIN command, you can easily see that there is no index on the end_date field.

More Tips by Ben and JRRL3
  1. It does not require admin permissions.
  2. It should be run from the command line (and not from external tools like phpMyAdmin).
Bottom Line
Root cause analysis is much simpler now.

Keep Performing,
Moshe Kaplan

Jun 16, 2011

What should I do when MySQL storage files are getting too large?

As MySQL is the de-facto database of the internet and the cloud, we dedicate another post to what you should do before getting to the money time.

Some methods, including stored data compression and splitting the InnoDB storage file, are covered in this post.

Compressing the File
InnoDB data compression is possible in a way that is very similar to disk compression, with its pros and cons. This method is defined in the table declaration statement and explained in the MySQL documentation. Will it improve MySQL performance? Fortunately, the answer is yes! A recent post by the MySQL@Facebook team showed a 10% boost in response time and an improvement in the cache hit ratio.

Splitting the Storage File
The default InnoDB configuration uses a single file named ibdata1. This file stores both data and configuration. The bad news is that this file always expands and cannot be shrunk.
Usually people notice this fact when the database is getting too large and they have storage or backup issues. Usually in these cases, the system is already in production and cannot suffer downtime. Therefore, you should take care of splitting this file in the early days of the system, rather than in the money time.

What to do?
  1. Configure MySQL to save the data in separate files, one per table. Please notice that this will work only for tables that are created from now on. Do this by adding the following lines to the my.cnf file:
    1. [mysqld]
    2. innodb_file_per_table
  2. Stop the server for maintenance and perform one of the following three methods:
    1. Convert all InnoDB tables to MyISAM and back
    2. Export only InnoDB tables, drop them, delete ibdata1 and import InnoDB tables.
    3. Export all databases, delete ibdata1 and import everything back.
  3. I recommend that you choose option 2 and perform it according to the following export-delete-import procedure:
    1. mysqldump the whole database.
    2. Stop the MySQL daemon.
    3. Delete the ibdata and ib_logfile files.
    4. Delete the database folders (all except the mysql system database).
    5. Change my.cnf file if needed.
    6. Restart the MySQL daemon.
    7. Import the dumped file back to the system.
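The seven steps above might be scripted roughly as follows; the paths and credentials are placeholders, and the sketch should be rehearsed on a staging server first:

```
# Rough sketch of the export-delete-import procedure (placeholders).
mysqldump -u root -p --all-databases > /backup/full_dump.sql   # step 1
sudo /etc/init.d/mysql stop                                    # step 2
sudo rm /var/lib/mysql/ibdata1 /var/lib/mysql/ib_logfile*      # step 3
# step 4: delete the database folders (keep the mysql system folder)
# step 5: ensure innodb_file_per_table appears under [mysqld] in my.cnf
sudo /etc/init.d/mysql start                                   # step 6
mysql -u root -p < /backup/full_dump.sql                       # step 7
```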
Bottom Line
Sometimes, it is wise to perform tasks earlier in order to avoid complex issues in the future.

Keep Performing,

May 9, 2011

Monitoring and Taking Care of MySQL Replication

If you are getting your slave into production soon, you should make sure that you know how to take care of it when something goes wrong.

Whether the slave is stuck can be monitored using open source tools like Nagios and its MySQL plugins. In general it is wise to have such a tool monitor all your system and business aspects, including the web server and login rates.

How do I know whether my slave is stuck or not?
  1. Connect to MySQL on the slave using a permitted user.
  2. Run the show slave status\G command.
  3. The result should include the following lines with a Yes value. If either of them is No, there is a problem:
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
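For monitoring scripts, the same check can be automated. Here is a hedged sketch that parses sample output embedded for illustration; in production you would capture the real output with something like mysql -e 'SHOW SLAVE STATUS\G' instead:

```shell
# Count the "Running: Yes" lines in SHOW SLAVE STATUS output.
# The sample output below is hard-coded for illustration only.
status_output='Slave_IO_Running: Yes
Slave_SQL_Running: Yes'
running=$(printf '%s\n' "$status_output" | grep -c 'Running: Yes')
if [ "$running" -eq 2 ]; then
  echo "replication OK"
else
  echo "replication stuck"
fi
```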

It is stuck. What should I do?
  1. Connect to the file system with a permitted user.
  2. Check the replication error log and act according to it (tail -n 50 /var/lib/mysql/
Bottom Line
Get into production!

Keep Performing,

Apr 7, 2011

my.cnf and MySQL Tuning

You have an urgent performance case in a MySQL system and you have to pick some low hanging fruit fast. What should you do?

MySQL Tuner is Your First Try!
MySQL Tuner is an open source Perl script that analyzes your system and lets you know about hot spots. You can find a nice example of its results below:

-------- Performance Metrics -------------------------------------------------
[--] Total buffers: 34.0M global + 2.6M per thread (1000 max threads)
[!!] Allocating > 2GB RAM on 32-bit systems can cause system instability
[!!] Maximum possible memory usage: 2.6G (16% of installed RAM)
[OK] Slow queries: 0% (141/99M)
[!!] Highest connection usage: 100%  (1001/1000)
[OK] Key buffer size / total MyISAM indexes: 8.0M/5.0G

my.cnf is the First Answer!
my.cnf, the MySQL configuration file, is probably the first place to check, and it is located at /etc/my.cnf. In this file you configure the small items that make the difference between a local store webpage and the next hit web application.

Where Can I Find my.cnf Examples?
Luckily, MySQL provides out of the box five my.cnf best-practice files for different machine sizes. These files are located at /usr/share/doc/mysql-server-5.X.XX/ and include: my-small.cnf, my-medium.cnf, my-large.cnf, my-huge.cnf and my-innodb-heavy-4G.cnf. Yet, despite the naming, these configurations are targeted at small systems and none fits even a 16GB RAM machine (How large? Take a look at the comment in the head of my-huge.cnf: "This is for a large system with memory of 1G-2G").

So What Should I Do?
Go over the sample configurations and analyze what you need to optimize to meet your actual needs. Don't forget to check your current system configuration using SHOW VARIABLES before getting into the cold water. Some of the following hints will help you meet your target:
  1. key-buffer: 
    1. Check using SHOW GLOBAL STATUS that the key_reads/key_read_requests ratio is less than 0.01 to avoid disk fetch.
    2. If you fail to meet this ratio, enlarge the key-buffer parameter.
  2. sort_buffer_size: 128-512MB, allocated for sorts by each thread.
  3. read_buffer_size: 8M, allocated for sequential scans by each thread.
  4. read_rnd_buffer_size: 8M, allocated for reading rows in sorted order by each thread.
  5. thread_concurrency: should be equal to 2 × the number of available cores.
What Next?
Well, it's time to scan the SQL statements that take too long:
  1. Enable the slow query log in the my.cnf: log-slow-queries=/home/mysqld-slow.log 
  2. Set the minimum processing time to be included in the log: long_query_time = 1
  3. Analyze the results: tail -n 1000 /home/mysqld-slow.log
  4. Verify there is no "connection leakage": that there are not too many open connections in the database. Check it using SHOW PROCESSLIST;
  5. Analyze locking using SHOW STATUS LIKE 'Table%'; and SHOW ENGINE INNODB STATUS if you use InnoDB as your database engine.
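Putting the hints from both lists together, a my.cnf fragment might look like the sketch below; the values mirror the text and are starting points, not universal recommendations for your workload:

```
[mysqld]
# grow until key_reads/key_read_requests drops below 0.01
key_buffer_size = 256M
# per-thread buffers (see the hints above)
sort_buffer_size = 128M
read_buffer_size = 8M
# roughly 2 x available cores
thread_concurrency = 8
# slow query logging
log-slow-queries = /home/mysqld-slow.log
long_query_time = 1
```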
Keep Performing,

Apr 6, 2011

How to Replicate Part of the Tables in MySQL

When you build a DWH or run other intensive processing on a database slave copy, you may not want to replicate the whole database.

How can you replicate only some of the tables, or even part of existing tables, to a slave?
In general, there are 4 ways to replicate data:

  1. Database Definition: You can define which database names will or won't be replicated. You can place all the tables that need to be replicated in one database and all the others in another one. That way you can replicate only the first database.
  2. Tables Definition: You can define which specific tables will or won't be replicated.
  3. Part of a Table, by SELECT: Column-level limitation is supported natively starting from version 5.1.21. Therefore, you can easily replicate only several columns (replication discards columns that do not exist in the slave). In earlier versions, you could perform Vertical Sharding, keeping some of the columns in one table and the others in another table. Then you could replicate only one of the tables.
  4. Part of a Table, by WHERE: Row-level limitation is not supported natively. Therefore, you cannot easily replicate only some of the rows of a single table. Yet, you can perform Horizontal Sharding, keeping groups of rows in different tables. Then you can replicate only some of the tables to your slave.
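As a sketch of options 1 and 2, the filtering is configured in the slave's my.cnf (the database and table names below are made up for illustration):

```
[mysqld]
# Replicate everything in one database (option 1)
replicate-do-db = reporting
# ...or only specific tables (option 2)
replicate-do-table = shop.orders
replicate-do-table = shop.customers
# wildcard form is also available:
# replicate-wild-do-table = shop.order%
```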
When Should You Avoid Partial Replication?
If your slave is being used as a copy for DRP or HA, you may still want it to fully match your master. If the copy is done for specific processing and you want to avoid unneeded replication, this is definitely the right tool for you.

Syntax is the King
Most of the syntax is performed on the slave and it is detailed in Sunny Walia's post.

Keep Performing,

Mar 28, 2011

MySQL Partitioning. One step before Sharding.

Many of you may consider Sharding to meet your large scale database demands. However, in some cases it seems too early, since you may not have the needed bandwidth in the current phase, or you are considering adopting a Sharding COTS solution like ScaleBase and need a solution for the mid term.

I don't have time so what should I do?
The answer is simple: you may choose the MySQL built-in mechanism named MySQL Partitioning.
This mechanism partitions your table using one of 4 different rule types. Its main benefit is keeping the whole process hidden from the applicative DBA and the software engineer. Try one of these methods:

  1. Range Partitioning: partition by column value ranges. Best fit for known ranges, such as archiving by date range.
  2. List Partitioning: similar to the above, but based on discrete values. Best fit for archiving by years.
  3. Hash Partitioning: partition by a user-defined hash function. Best fit when data ranges are unknown, but you know their distribution. Should be used when you partition a table according to a foreign key. Please notice that in any case, the hashed key must be included in the table's primary key.
  4. Key Partitioning: similar to Hash, but this time the MySQL server takes care of the function. Best fit when the data distribution is uniform. The partition key should be included in the primary key.
How do I declare that?
Using DDL: just add a few more lines to your CREATE TABLE statement and you will get the partitioning. For example, to split by year, use a statement like the following (the table and column names are illustrative):

CREATE TABLE archive (class_year INT NOT NULL)
PARTITION BY LIST (class_year) (
    PARTITION p2009 VALUES IN (2009),
    PARTITION p2010 VALUES IN (2010),
    PARTITION p2011 VALUES IN (2011),
    PARTITION p2012 VALUES IN (2012)
);

What do I get?
Mayflower has presented very nice numbers with a 200M record table partitioned into 400 parts. They reached 4,000 INSERT/UPDATE and 6,000 SELECT statements per second.

Bottom Line
If you don't have a lot of time and you need an instant solution, go for Partitioning.

Keep Performing,

Mar 22, 2011

MySQL DRP Done Right

When your systems are up and running and money is getting into the bank, it is the best time to ask yourself: What if my system fails? What if my data center (yes, even your cloud one) goes down?

The Instant DRP Solution
The instant solution is preparing a backup data center. Then you should make sure that the web and application servers will be ready to take over when needed. Yet, what about taking care of your data? How will the database be managed across two different data centers?

Many cloud, web, mobile and service providers base their systems on MySQL. Therefore, this post is dedicated to the best MySQL DRP strategies.

Fail and Then Manually Recover
The simplest and easiest method to implement is the Master-Slave schema. This method includes:
  1. "Master": A single database server that serves both writes and reads.
  2. "Slaves": One or more servers that serve reads only.
When the master fails, you can promote one of the slaves to become the master.
This method can also be used to reduce stress on the master and enable various partitioning modes.

Fail and Recover Automatically
A more complex way to implement, but faster to recover (if nothing goes wrong), is the Master-Master schema. In this schema both servers serve as masters, and when one fails the other takes over. Please notice that this method is a little bit tricky and you should avoid conflicts, since the conflict resolution is poor:

  1. You need to either avoid auto-increment fields or make sure that only one server is the active master at a time.
  2. You should avoid having the same record modified by both parties.
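For item 1, MySQL's auto-increment variables can keep two masters from generating colliding keys; a sketch for a two-master setup (my.cnf fragments):

```
# Master A's my.cnf
[mysqld]
auto_increment_increment = 2   # both masters step keys by 2
auto_increment_offset = 1      # A generates 1, 3, 5, ...

# Master B's my.cnf: same increment, but auto_increment_offset = 2,
# so B generates 2, 4, 6, ...
```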

Do It the Cloud Way
Cloud providers these days propose cloud based databases (PaaS). These solutions remove the database management headache from your task list. Amazon, for example, provides the Relational Database Service (Amazon RDS). RDS provides a highly available managed MySQL instance that supports instant recovery to any point in time. This service is more expensive than running your own MySQL instance, but if you are a medium shop, it may be the best offer for you.

Bottom Line
Select your MySQL DRP method based on your resources, recovery needs and the effort you are willing to allocate. Choose Master-Slave for novices, Master-Master for experts and the cloud way to avoid the headache.

Keep Performing,
Moshe Kaplan

P.S feel free to contact for DRP and MySQL consulting

Mar 6, 2011

Faster Website? CDN Might be the Answer!

Why Are Our Japanese Users Angry?
A common question when you set up a new SaaS or just a fancy website is: "Why do my remote users have a hard time with my application, while when I'm in a physical location close to my hosting provider I get a relatively good user experience?"

It's the TCP/IP Doctor Watson
Window Size is a significant parameter in the TCP/IP stack. This parameter controls how much data can be sent by either party before receiving an ACK from the other. If your window size is set to 8KB and the round trip time (RTT) between you and the server is 20ms, you can get at most 400KB/s (1sec/20ms*8KB), or ~3.2Mbps.

What If You Were on the Other Side of the Globe?
Well, if your RTT is more like 250ms, you can hardly get 32KB/s (1sec/250ms*8KB), meaning that you would download a 1MB image in about 30 seconds... Not something to proudly report back home.
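The arithmetic in both examples can be checked with a quick awk calculation (the window size and RTT values are the ones used above):

```shell
# Throughput ~= window_size / RTT, ignoring slow start and packet loss.
awk 'BEGIN {
  window_kb = 8
  printf "RTT 20ms  -> %d KB/s\n", window_kb * 1000 / 20
  printf "RTT 250ms -> %d KB/s\n", window_kb * 1000 / 250
}'
```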

CDN, or: How Can I Solve This?
A CDN (Content Delivery Network) helps bring web content closer to the end user. This way it maximizes the throughput allowed by the TCP/IP window size.

How Does CDN Work?
  1. The CDN provider places many edge servers around the world (as many as tens of thousands of locations in some cases).
  2. The CDN provider hosts your DNS record in a DNS that supports Anycast. Anycast enables the same DNS record to point to different locations around the world based on the network topology.
  3. The end user is directed to the closest edge server based on the Anycast DNS.
  4. The edge server tries to serve the file if it is cached on it (images, JavaScript files, static HTML, Streaming Video or just a static copy of your favorite news site are good examples).
  5. If the requested file is dynamic or not cached, the edge server requests the file from the origin server using a large TCP window size. This way the long 230ms leg (250ms-20ms) is traversed with a much larger window size (say 64KB) and the throughput is much higher.
Bottom Line
A CDN can help you improve your end user experience; use it when needed.

If you need assistance, don't hesitate to communicate,
Keep Performing,
Moshe Kaplan

Feb 28, 2011

DRP, Cloud and Some Tips

One of the most popular use cases for cloud computing is DRP.

Why DRP?
DRP is one of the costliest operations in a system's life cycle. Not only do you need to keep a similar instance of every server and device in your backup site, you also need to make sure that these copies remain in a condition similar to the original, so that when things go wrong, you can restore.

What Would You Do in a Perfect World?
If only your system could work in an Active/Active mode, so that every penny you pay results in another action that contributes to your bottom line. But until you reach this target, let's see how you can keep risks low and costs affordable.

A Few Initial Tips:

  1. Keep your data outside of the application servers: data should be stored in the database, and files should be stored in external storage (Amazon S3) or a CDN.
  2. Replicate your static files (PHP files, etc.) using websync.

Keep Performing,
Moshe Kaplan

Jan 22, 2011

MySQL. An Internet Standard.

As you probably already know, last week I gave a presentation at the Database2011 conference on how MySQL became an Internet standard.
The answer is simple: a great community and good support. The proof of the pudding is in the eating: everyone uses it, including Facebook, Twitter and Google (if you want to hear other opinions, try following the comments flow).

MySQL Limitations
Yet, if you explore MySQL's capabilities, you may find out that it lacks real multi-threading capabilities: its table size is limited to an effective 50-100M rows, its SELECT performance is limited to 50 statements on a single table per second, and its INSERT performance is limited to 700 INSERT statements per second (based on a standard Amazon instance and the InnoDB engine).
So, how can the Internet industry, with dozens of thousands of SQL statements per second in a medium site, use such a limited database?

The answer is simple: use Sharding. Sharding is a method for slicing your tables into smaller ones and storing parts of the data in each one of them. There are several common strategies, including:
  1. Vertical Sharding: Tear a table into two, store only few columns in first table and the rest in the second table.
  2. Horizontal Sharding: Place each group of rows in another table. Horizontal Sharding can be implemented using various algorithms that include static hashing, hashing with directory mapping, key based directory mapping and signature based mapping.
The main issue with Sharding is reporting. Since direct grouping cannot be supported when rows are spread across various tables, you should use Map Reduce based solutions to accomplish this task.
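As a sketch of the simplest strategy, static hashing, a modulo on the row key picks the target table; the table names and shard count below are made up for illustration:

```shell
# Static hashing: route each user id to one of 4 hypothetical
# shard tables (users_0 .. users_3) by a modulo on the key.
for user_id in 17 42 1001; do
  shard=$((user_id % 4))
  echo "user $user_id -> table users_$shard"
done
```

Note that a plain modulo makes re-sharding painful (every key moves when the shard count changes), which is why the directory-mapping variants above exist.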

Emerging Products
In the last year several new COTS Sharding products were introduced to the market. These products use two mechanisms in order to get over the MySQL limitations:
  1. Gizzard and ScaleBase (founded by Industry experts Doron Levari and Liran Zelkha) use the MySQL Proxy mechanism in order to implement a load balancing and Sharding solutions in front of regular MySQL databases.
  2. Xeround and Akiban implemented a new type of storage engine based on the MySQL Storage engine API.

Will MySQL remain an Internet standard in the future?
MySQL will keep being an Internet standard, unless unexpected decisions are taken by Oracle, which now owns it. Unwise marketing decisions could harm the community and may cause a major shift to other solutions, including NoSQL.

Bottom Line
If you have a great internet idea, you have all the needed tools to start.

Keep Performing,
Moshe Kaplan

