Blog: Fast data comparison with Database Compare Suite

We continue our blog post series covering the key Database Compare Suite features. In our previous posts, we talked about validating data migration and discovering exact data differences in selected tables. Now we want to talk about comparing huge data sets. When you compare tables with millions or billions of rows, the operation may take minutes or even hours. That is unacceptable, especially when it means downtime for the production environment.

So, we want to show how you can use Database Compare Suite to quickly verify the migration of extremely large tables. Check out our video to see how you can do this with the fast data comparison operation in Database Compare Suite.

As you can see, Database Compare Suite can verify the data in two tables with a million rows in less than a minute. Using the fast data comparison operation, you can quickly validate the results of your database migration.

Background on the fast data comparison operation

Let us look at how the fast data comparison operation works and why it is so fast.

First of all, Database Compare Suite compares the number of rows in the tables. If the row counts differ, the tables are obviously not equal. On the fast data comparison options page, you can opt to compare only the row counts. This is useful as an initial check when you need to validate data migration.
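To make the row count check concrete, here is a minimal sketch of the idea in Python. It is not how Database Compare Suite is implemented; it uses two in-memory SQLite databases as stand-ins for the source and target, and the `orders` table, `make_db` helper, and `row_counts_match` function are hypothetical names chosen for illustration.

```python
import sqlite3

def row_counts_match(conn_a, conn_b, table):
    """Compare COUNT(*) for the same table on two connections."""
    # The table name is interpolated directly, so pass trusted names only.
    query = f'SELECT COUNT(*) FROM "{table}"'
    count_a = conn_a.execute(query).fetchone()[0]
    count_b = conn_b.execute(query).fetchone()[0]
    return count_a == count_b, count_a, count_b

def make_db(rows):
    """Build a throwaway in-memory database with a sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn

source = make_db([(1, 10.0), (2, 20.5), (3, 7.25)])
target = make_db([(1, 10.0), (2, 20.5)])  # one row was lost in "migration"

print(row_counts_match(source, target, "orders"))  # (False, 3, 2)
```

A mismatch here settles the question immediately. A match only means the tables have the same size, so the hash comparison is still needed to say anything about their contents.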

After that, Database Compare Suite calculates hash values for both tables. If you compare tables from the same database platform, it uses the built-in hash function, which returns the result very quickly. However, if you compare tables from different database platforms, Database Compare Suite calculates hash values for each column of the table. In this case, you can even identify which columns differ.
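The per-column approach can be sketched as follows. This is an illustration under assumptions, not the tool's actual algorithm: it assumes both tables are read in the same row order (for example, sorted by primary key), that values are already normalized to comparable Python types, and it uses SHA-256 because the post does not say which hash function the product uses. The function names are hypothetical.

```python
import hashlib

def column_hashes(rows, columns):
    """Return one SHA-256 digest per column, fed in row order."""
    hashers = {name: hashlib.sha256() for name in columns}
    for row in rows:
        for name, value in zip(columns, row):
            encoded = repr(value).encode("utf-8")
            # Length-prefix each value so value boundaries stay unambiguous.
            hashers[name].update(len(encoded).to_bytes(8, "big"))
            hashers[name].update(encoded)
    return {name: h.hexdigest() for name, h in hashers.items()}

def differing_columns(rows_a, rows_b, columns):
    """List the columns whose digests differ between the two row sets."""
    hashes_a = column_hashes(rows_a, columns)
    hashes_b = column_hashes(rows_b, columns)
    return [name for name in columns if hashes_a[name] != hashes_b[name]]

columns = ["id", "name", "amount"]
source_rows = [(1, "alice", 10.0), (2, "bob", 20.5)]
target_rows = [(1, "alice", 10.0), (2, "bob", 20.0)]  # amount changed

print(differing_columns(source_rows, target_rows, columns))  # ['amount']
```

Only one digest per column has to be compared, no matter how many rows the table has. The price is that the result names the differing column but not the differing row.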

Finally, you should consider one downside of the fast data comparison operation in Database Compare Suite. Because this operation compares hash values rather than the actual data, there is a chance of a false-positive result, that is, tables reported as equal when they are not. The probability is very low, but it is not zero, because two different data sets can in theory produce the same hash value. So even if the fast data comparison operation says the tables are equal, they may still be different. The opposite result is reliable: if the operation says the tables are different, they really are not equal. This makes the operation a good fit for fast checks.
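To see why equal hashes do not strictly prove equal data, the sketch below forces a collision with a deliberately weak hash. It keeps only the first byte of SHA-256, so there are just 256 possible values and two different inputs must collide within the first 257 attempts. This is purely an illustration: `tiny_hash` and `find_collision` are hypothetical helpers, and a full-length hash makes an accidental collision far less likely (for an ideal n-bit hash, two given different inputs collide with probability of about 2^-n).

```python
import hashlib

def tiny_hash(data: bytes) -> bytes:
    # Deliberately weak: keep only the first byte of the SHA-256 digest.
    return hashlib.sha256(data).digest()[:1]

def find_collision():
    """Return two different inputs that share the same tiny_hash value."""
    seen = {}
    i = 0
    while True:
        data = f"row-{i}".encode("utf-8")
        digest = tiny_hash(data)
        if digest in seen:
            return seen[digest], data
        seen[digest] = data
        i += 1

first, second = find_collision()
print(first, second)
print("inputs differ:", first != second)
print("weak hashes match:", tiny_hash(first) == tiny_hash(second))
```

The two inputs are different, yet a comparison based on `tiny_hash` alone would report them as equal. The reverse cannot happen: identical data always produces identical hashes, which is why a reported difference can be trusted.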

In our next Database Compare Suite blog post, we will demonstrate how you can automate DevOps operations with Database Compare Suite.

Be sure to download a free version of this ultimate tool for database developers and administrators.