 
DB Best Chronicles
 
Talks on Big Data, Mobile Apps, Web and Software Development

How Fast is FAST?

Posted by | On May 25th, 2017 | In Query Tuning, SQL Server
 

When listening to an online presentation a few days ago, the speaker suggested using one of my favorite, underused query hints: FAST N.  This replaces an older table hint called FASTFIRSTROW, which is one of the few deprecated features to actually be removed from SQL Server. The speaker suggested using the FAST 1 hint, because ‘sometimes it improves the query’, but didn’t say any more about how or why it would improve the query, and why you might NOT want to use it. So, I’m going to tell you what this hint does.

When I first heard about the FASTFIRSTROW hint, years ago, it was described as a way to force SQL Server to use a nonclustered index to avoid a sort. It, and its descendant FAST N, can certainly have that result, but that's not what they actually "do". I'm going to assume you're relatively familiar with the way SQL Server nonclustered indexes are organized. There is a leaf level that contains index rows for all the index keys, including all duplicate values. The rows are sorted by the index key. Each index row has a pointer of some sort to the actual row of data in the table. (The rest of the details are out of scope for this post.)

So suppose you have a nonclustered index on first name. The leaf level would contain all the first name values in order: Ann, Anne, Bjorn, Bob, Boris, Charlie, Chuck, Dan, Dmitry, Elena, etc. If there were multiple table rows with the first name Boris, there would be multiple index rows in the leaf level of the nonclustered index with Boris.
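If you want a concrete picture of the index being described, a minimal sketch might look like the following (the customers table and FirstName column come from the example query below; the index name IX_customers_FirstName is just a placeholder I made up):

CREATE NONCLUSTERED INDEX IX_customers_FirstName
ON customers (FirstName);
GO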

Now suppose someone runs a query like this:

SELECT * FROM customers
ORDER BY FirstName;

How could this query be satisfied? There are two ways. One way would be to read all the rows from the table, and then sort them by the FirstName column. The second way would be to scan the leaf level of the nonclustered index, in which the names are already sorted, and for each row, follow the pointer to the row in the table. The first solution is generally faster, as SQL Server has some very efficient internal sorting algorithms, and following a pointer for every single row in the table is usually much too expensive. But there is a case where the second solution might be advantageous.

The SORT operator is considered a 'blocking operator' in a query plan because it has to consume all the input, and complete the sorting, before any output can be produced. Another name for a blocking operator is "stop-and-go". The alternative to a blocking operator is sometimes called a "flow-through" operator. These flow-through operators read rows, process them, and either send them on to the next operator, or not, if they don't meet the right conditions. So they are basically consuming rows and producing rows concurrently, and the rows are flowing through the operator. For more information about blocking vs non-blocking operators, see Craig Freedman's excellent post here: https://blogs.msdn.microsoft.com/craigfr/2006/06/19/properties-of-iterators/

The important fact here is that NO output can be produced until all the input is consumed and the SORT is completed.

If you have an application that can display values as soon as SQL Server returns them, it can potentially be useful to start returning something as soon as possible. How do you measure the performance of your queries? How does an end-user interpret the performance? Is it the total time to complete a query, or is it the time to get any output back (i.e. "time to first row")? If SQL Server could follow the leaf level of the nonclustered index, it would have the first value in sorted order immediately. And the next few values could be returned quickly, too. The total time to return ALL the rows might be slower than doing the sort, as SQL Server followed the pointer from every single index row, but there would be no waiting to start getting output from the query. This is where the hint FASTFIRSTROW came in handy. It told SQL Server to come up with a plan that would return the first row as quickly as possible, no matter how much time it took for ALL the rows. And frequently, this meant using a nonclustered index to access all the rows in the table.
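You can see this plan shape today without any hint at all by forcing the nonclustered index with a table hint (a sketch only, using the placeholder index name from above; forcing an index like this in production code is rarely a good idea):

-- Scan the nonclustered index leaf level: no Sort operator, rows stream out immediately
SELECT * FROM customers WITH (INDEX (IX_customers_FirstName))
ORDER BY FirstName;
GO

-- Scan the base table, then sort: a blocking Sort operator before any output
SELECT * FROM customers WITH (INDEX (0))
ORDER BY FirstName;
GO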

The FASTFIRSTROW table hint was deprecated and replaced by the FAST N query hint. This hint is similar, but is more flexible. With this new hint, you are basically telling the optimizer to assume only N rows will be returned. In most cases this means the optimizer will assume a rowcount of N for one or more operators in the plan. If N is one, this is synonymous with FASTFIRSTROW. So is a query with the FAST hint faster than not having it? Should you use it?
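For reference, here is roughly how the removed table hint and its replacement line up, using the same hypothetical customers table (the FASTFIRSTROW line is shown only as a comment, since the hint no longer exists in current versions):

-- Old, removed table hint:
-- SELECT * FROM customers WITH (FASTFIRSTROW) ORDER BY FirstName;

-- Current query hint equivalent:
SELECT * FROM customers
ORDER BY FirstName
OPTION (FAST 1);
GO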

The only way to tell is to test it, of course.

But there is a caveat I want to point out. If you want to see if using the FAST hint is faster, you can’t rely on your estimated execution plans. In the AdventureWorks2014 database, there is a nonclustered index on the ProductID column in the Sales.SalesOrderDetail table.  Look at the estimated plans for two queries that sort by ProductID. The second one uses the FAST 1 hint:

SELECT * FROM Sales.SalesOrderDetail
ORDER BY ProductID;
GO

SELECT * FROM Sales.SalesOrderDetail
ORDER BY ProductID
OPTION (FAST 1);
GO

I get the plan shown here:

[Image: estimated execution plans for the two sorting queries]

If I look at the Query Cost relative to the batch, it looks like the query with the hint is infinitely faster than the first query without the hint (0% compared to 100%). But this is obviously not the case. If I look at the details of the Index Scan operator in the second plan, I see that the way the optimizer determined the best plan for returning one row was to assume there is only one row in the table. The Estimated Number of Rows for the Index Scan in the second query is 1! So the cost comparison is saying that a query that only had to sort one row would be almost infinitely faster than a query that needed to sort 121,000 rows. But this is only the estimated plan. When the queries are executed, all 121,000 rows will be processed by both of them, and it will usually turn out that the second one is not as fast.
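One straightforward way to compare the two versions for real (a sketch; I would run it against a test copy, since returning all the rows to the client will dominate the timings) is to look at actual execution statistics instead of the estimates:

SET STATISTICS TIME ON;
SET STATISTICS IO ON;
GO

SELECT * FROM Sales.SalesOrderDetail
ORDER BY ProductID;
GO

SELECT * FROM Sales.SalesOrderDetail
ORDER BY ProductID
OPTION (FAST 1);
GO

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
GO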

Even though using a hint called FAST does sound appealing, it is not a magic button. In some cases, it can cause the optimizer to come up with a better or faster plan, and in some cases, it will do just the opposite.

Rolling Back a Transaction on a Memory Optimized Table

Posted by | On May 18th, 2017 | In In-memory OLTP, SQL Server
 

After three and a half weeks with DB Best, things are starting to settle in. After attending the SQL Server 2016 Workshop in Bellevue, and delivering one section on In-Memory OLTP and columnstore indexes, I flew to Houston and delivered the entire Workshop on Friday, May 12. I did have local support from Peter DeBetta, also from DB Best, and John Cook, from Microsoft.

The Houston Workshop reached capacity by the Monday before the event, but with John Cook's help, along with his colleague Sean Kelley, we found a larger room and ended up with 19 attendees from 15 Houston-area companies.

We will be scheduling more of these Hands-On Lab workshops in the future, and I’ll be announcing them here and on Twitter. My next public event will be SQL Saturday Los Angeles on June 10th. I hope to see many of you there! http://www.sqlsaturday.com/640/eventhome.aspx

There were lots of great questions from the folks in attendance, mostly centered around In-Memory OLTP, one of my favorite topics. Maybe I should do a whole blog series about this topic, like the awesome Niko Neugebauer has done for columnstore indexes? http://www.nikoport.com/columnstore/

In this post, I’m going to address one of the questions that came up in the Houston workshop. However, before I can answer the question, a bit of background information is needed.

During the workshop, I discussed the fact that with In-Memory OLTP, logging is optimized in several ways, and one of the optimizations is that SQL Server will not log any changes to rows until the change is committed. So, there is nothing that needs to be rolled back during database recovery, because only committed changes are ever persisted. But, if a transaction makes multiple changes, those changes are made to rows in memory. The question that came up is: How does SQL Server know that a change made to a row in memory is committed or not? What happens to the rows if the change is rolled back instead of committed?

To answer this, let’s see what a row in memory looks like for a memory-optimized table. Every row has a header followed by the payload (the data that the row contains). In this post, we’ll just be talking about the header. The first two values in the header are timestamps and these are the most important pieces of information to answer the question at hand. (The other values in the header will not be discussed in the post, but I’ll use this diagram anyway and future posts can describe these other values.)

[Image: in-memory row format — a row header (Begin Ts, End Ts, and other fields) followed by the payload]

 

The Begin Ts (or "Begin Timestamp") is a timestamp that indicates the time a row was inserted. The End Ts (or "End Timestamp") is a timestamp that indicates when a row was deleted. For rows that haven't been deleted, there is a special value ∞ (infinity) that indicates the row is still active. The End Ts is needed because rows are not actually removed as soon as they are deleted, partly for performance reasons, and partly to support snapshot isolation. A running transaction, for its entire lifetime, continues to see the rows that were valid at the time the transaction started. Even if a row is deleted, transactions that started before the DELETE will continue to be able to read and return the row. The two timestamps define a row's period of validity. If a row was inserted at timestamp 100 and deleted at timestamp 200, any other transaction that started between time 100 and 200 will be able to read that row.
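To make the period-of-validity rule concrete, here is a tiny, purely conceptual sketch; this is not a real system query, the variables simply stand in for the row's header values and a reading transaction's start timestamp:

DECLARE @BeginTs int = 100,   -- row inserted at timestamp 100
        @EndTs   int = 200,   -- row deleted at timestamp 200
        @ReadTs  int = 150;   -- a transaction that started at timestamp 150

SELECT CASE WHEN @ReadTs >= @BeginTs AND @ReadTs < @EndTs
            THEN 'row version is visible to this transaction'
            ELSE 'row version is not visible to this transaction'
       END AS visibility;
GO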

To get back to the original question of what happens for a transaction that is rolled back, we need to know that every database that supports memory-optimized tables manages two internal counters:

  • The Transaction-ID counter – this identifies each active transaction. It is a global, unique transaction id value (xtp_transaction_id) for the in-memory OLTP portion of a transaction. It is incremented every time a new transaction starts, and is reset whenever the SQL Server instance restarts.
  • The Global Transaction Timestamp – this is the “commit” timestamp. It is also a global and unique value, but not reset on a restart. SQL Server increments the value monotonically each time a transaction issues a COMMIT. I’ll use the simple term timestamp, instead of Global Transaction Timestamp, to refer to this value. (Note that although this value is called a timestamp, it contains no information reflecting an actual time or date. It is an internal counter value which has no inherent meaning except in relation to other timestamp values.)

A simple way to differentiate these two values is that the Transaction-ID counter is used while a transaction is in progress to keep track of rows that it has changed, and the Global Transaction Timestamp is used for transactions that have been committed.

SQL Server keeps track of each active transaction in an internal transaction table. When a transaction starts, SQL Server increments the Transaction-ID counter, and assigns a unique transaction ID to the transaction. When the transaction issues a COMMIT, SQL Server generates a commit timestamp, which it stores in the internal table initially, and then writes the value to the header of affected rows, once it validates the transaction and hardens the changes to the log.
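If you want to watch these values on a live system, one place to look is the sys.dm_db_xtp_transactions DMV (a sketch; the column names below are the ones I expect from the SQL Server 2016 documentation, so verify them against your build):

SELECT xtp_transaction_id,   -- value assigned from the Transaction-ID counter
       transaction_id,       -- the regular SQL Server transaction id
       session_id,
       begin_tsn,            -- timestamp the transaction reads as of
       end_tsn,              -- commit timestamp, once one has been assigned
       state_desc
FROM sys.dm_db_xtp_transactions;
GO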

So, let’s see what this looks like.

Assume we have a row for a customer named Jane Wiggins in Kirkland, that was inserted at timestamp 100. (This means the transaction that inserted the row received a timestamp value of 100 when it committed.) The row might look something like this:

[Image: the original row version — Begin Ts = 100, End Ts = ∞, payload: Jane Wiggins, Kirkland]

Assume Jane has now moved to Redmond, so we need to update the row. All updates on memory-optimized tables are performed as a two-step process: delete the original row and insert a new one. When this update transaction starts, it receives a Transaction-ID value from the counter; let's say it's TX1. So during the transaction, before it's committed, the End Ts of the original row, and the Begin Ts of the new row, both contain this internal value TX1. There are now two rows in memory, which look like the following:

[Image: the two row versions during the update — the original row with End Ts = TX1, and the new Redmond row with Begin Ts = TX1]

Any other transactions that run while this TX1 transaction is in progress will read the original version of the row. The internal transaction table will indicate that TX1 has not committed yet, so it is known that no changes have actually happened to this row yet.
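As a hedged illustration only (assuming a hypothetical memory-optimized dbo.Customers table with Name and City columns), the whole scenario looks something like this from the T-SQL side; the delete-plus-insert bookkeeping described above happens under the covers of the single UPDATE:

BEGIN TRAN;
    -- internally: the Kirkland row version gets End Ts = TX1,
    -- and a new Redmond row version is inserted with Begin Ts = TX1
    UPDATE dbo.Customers WITH (SNAPSHOT)
    SET City = 'Redmond'
    WHERE Name = 'Jane Wiggins';
    -- another session reading dbo.Customers right now still sees 'Kirkland'
COMMIT TRAN;    -- or ROLLBACK TRAN, which is the case discussed next
GO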

If this transaction was aborted, SQL Server sets the Begin Ts value of its new versions to infinity, thereby making them invisible to all transactions, and attempts to reset the End Ts fields of its old versions to infinity. The rows would then look like the following. You can see the original version now looks like it did before the update, and the new (aborted) version would not be readable by any other query.

[Image: the row versions after rollback — the original row back to End Ts = ∞, and the aborted new version with Begin Ts = ∞]

If another transaction has already detected the abort, it might have created another new version and reset the End Ts field of the old version. If so, the aborted transaction leaves the End Ts value unchanged. But in any case, the new version is not available to anyone after a rollback or abort of any kind.

So if you’ve read and understood this, you might just have even more questions now about In-Memory OLTP! Hopefully, I’ll answer more of them in upcoming posts. Feel free to ask your follow-up questions as a comment. I’d love to write about what you want to hear about.

Thanks!

Kalen

 

 

 

Automated sniffing technology for upgrading unsupported SQL Server versions

Posted by | On May 16th, 2017 | In Customer Story, Database Upgrades, SQL Server
 

Our customer needed a way to upgrade their unsupported SQL Server databases to newer versions of SQL Server. The major challenge was that they didn't have visibility into their system. Moreover, Microsoft upgrade tools can't handle database versions as outdated as SQL Server 2000.

With our innovative database discovery and monitoring tools, we were able to perform an inventory of their system. We developed an upgrade factory to get them back within Microsoft compliance. Now our customer has deep insight into their system and can take advantage of modern SQL Server capabilities with upgraded databases.


Moving Oracle Forms to the Web as Part of a Database Migration Project

Posted by | On May 16th, 2017 | In Customer Story, Data Management, Database Migration, SQL Server
 

A government healthcare organization wanted to move their Oracle Forms to a modern web-based UI. We offered to migrate their source Oracle database to SQL Server while converting the deprecated Oracle Forms to a modern-looking web interface. We used a previously developed technology to automate the major part of the Oracle Forms conversion.

To demonstrate the feasibility of this technology, we converted one of the most complex forms. Having successfully completed the proof of concept, we convinced the customer to continue our partnership.

Learn how we approached this database migration project from the following video.


Quick and easy automatic approach to database conversion

Posted by | On May 12th, 2017 | In Customer Story, Database Migration, SQL Server
 

One of the largest US educational organizations had been running their data handling system in an Oracle environment. They wanted to reduce maintenance costs driven by Oracle's expensive licensing model, which includes an additional fee for the partitioning feature.

We provided a solution for migrating the customer's database to SQL Server. Recreating the original partitioning scheme in SQL Server saved additional money on top of SQL Server's significantly lower maintenance costs compared to Oracle.

Check out how we performed the migration to SQL Server and saved our customer money in the following video:


Developing mobile applications to shape a brighter future

Posted by | On May 12th, 2017 | In Mobile Development, Web & Software Development
 

Staying on the cutting edge of modern technologies, the DB Best application development team constantly strives to offer the best solutions for our customers. Simply watch the following video to see examples of some cool features we can add to your business app, be it a cinema, a shopping mall, or any other place crowded with people.

Let’s take a look at the impressive list of technologies that made the creation of this mobile application possible.


Improving web app performance while migrating from DB2 to SQL Server

Posted by | On May 4th, 2017 | In Customer Story, Database Migration, SQL Server
 

One of the largest life insurance companies in the United States supported a reporting application. Their system included a Java-based web app and an Informatica data management solution connected to a DB2 database. To provide a web interface for reporting, they used a WebSphere server to host the Java application. With this interface, a small circle of users could create reports using data imported from the Informatica application.

Our customer wanted to reduce the maintenance costs and increase the application functionality. We came up with a solution to migrate the customer database to SQL Server and reconnect the application to the new database. As a result, the customer has been able to improve the application performance by taking advantage of modern SQL Server capabilities.

Check out how we migrated the application to the SQL Server platform in the following video.


Remote migration from Oracle to SQL Server using SSIS

Posted by | On May 4th, 2017 | In Customer Story, Database Migration, SQL Server
 

A global financial software provider needed to migrate their billing solution from Oracle to SQL Server for one of their strategic clients located on the other side of the world. The actual migration had to be performed by a remote team, so our customer was looking for an automated solution with detailed step-by-step instructions.

To meet the customer's needs, we developed a command-line migration utility accompanied by extensive documentation for the remote team. With our solution, the customer was able to migrate all data by themselves, within the designated outage window, without any issues.

Check the features of our migration solution in the following video.


Database Compare Suite update 4.8 brings Netezza support

Posted by | On May 3rd, 2017 | In Database Migration, Database Upgrades
 

We’re excited to bring you the latest release of Database Compare Suite. The new version 4.8 brings support for IBM Netezza versions 7.2.1 and higher along with some other improvements.

[Image: Database Compare Suite 4.8]

Covering All Major Database Platforms

With the support of IBM Netezza, the total number of database platforms available in the Database Compare Suite has reached 11. You can see the full list of supported database dialects in the image below. They include Amazon Redshift, Azure SQL Database, Greenplum, IBM DB2, IBM Netezza, Microsoft SQL Server, MySQL, Oracle, PostgreSQL, Sybase ASE, and Teradata.

[Image: Database Compare Suite 4.8 connection dialog showing the supported database platforms]

This means that you can use 55 heterogeneous and 11 homogeneous database pairs (55 being the number of distinct two-platform combinations of the 11 supported platforms), for an impressive total of 66 database pairs. For all of them you can perform data and schema operations such as migration, synchronization, and comparison.

After establishing the connection to the Netezza database instance, you should notice the bright new set of icons that we’ve used for this platform. Check the left side of the following image.

[Image: Database Compare Suite 4.8 connection dialog with the new Netezza icons]

New Features

We updated the list of column types that you cannot use as a part of the sorting key.

For schema comparison and synchronization operations, we added ‘Shared libraries’ to the list of object types that you can either include or exclude from the operation. Check the image below to see where you can find this option.

[Image: schema comparison and synchronization settings showing the new 'Shared libraries' object type]

Also, we improved the support for real numbers and large integers in the Fast Data Comparison operation. In addition, we implemented full-fledged support for negative intervals.

Download our free trial with no strings attached on the Database Compare Suite page.

Contact us to jumpstart your database migration project.

Altering In-Memory tables, and a FREE Workshop!

Posted by | On April 28th, 2017 | In Uncategorized
 

Hello World!

I hit the ground running this week as I started my new job at DB Best! I'm scheduled to present three one-day workshops over the first two weeks of May, in Bellevue, San Francisco, and Houston. I'm spending most of my days working through the Hands-on Labs that workshop participants will be able to do, to learn about some of the most exciting features of SQL Server 2016! Here's the link if you're interested in the Bellevue event:

Modernize Your Data Platform Using Microsoft SQL Server 2016

Note that it's FREE, and breakfast and lunch are provided. But capacity is limited, so don't wait. The registration links for San Francisco and Houston should be available soon. We'll announce them here and on Twitter. Make sure to follow @dbbest_tech!

I know this is sounding a whole lot like a marketing blog. I apologize. As you probably know, I don’t really do marketing, I do tech! So, here’s a technical tidbit for you, related to the topics to be covered in the workshop. One of the topics we’ll look at is all the great new features and enhancements for In-Memory OLTP in SQL Server 2016. (In fact, I just finished a book about this topic!)  Hopefully, these improvements will encourage a whole lot more people to give In-Memory OLTP a test-run, to find out for themselves just how amazing a performance increase is possible!

One of the new features is the ability to ALTER a memory-optimized table or index. We can add new indexes, we can add or drop columns, and we can change the bucket count of our hash indexes. One issue with modifying a memory-optimized table is that SQL Server will rebuild the entire table. It's all done in memory, of course, but for a huge table, it can still take some time. More importantly, it takes MEMORY! You'll need to have enough extra memory for a whole new copy of the table, and only after the ALTER operation finishes can the garbage collection process free up the original memory used by the table.
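If you want to see how much memory a rebuild will need to duplicate, one place to look before and after the ALTER is sys.dm_db_xtp_table_memory_stats (a sketch; the column names below are the ones I expect from the documentation for this DMV, so verify them against your build):

SELECT object_name(object_id) AS table_name,
       memory_allocated_for_table_kb,
       memory_used_by_table_kb,
       memory_allocated_for_indexes_kb,
       memory_used_by_indexes_kb
FROM sys.dm_db_xtp_table_memory_stats;
GO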

There is metadata available that allows you to determine when a table has been rebuilt. In a couple of the DMVs for memory-optimized tables and indexes, there is a column called xtp_object_id. The regular object_id for a memory-optimized table does not change when the table is rebuilt due to an ALTER operation, but the xtp_object_id will change. Let's see an example. I'm going to assume you have a database that has been set up to hold memory-optimized tables.
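If you don't have one handy, a minimal sketch of such a database might look like the following (the database name and file paths are placeholders; adjust them for your environment):

CREATE DATABASE IMDemo
ON PRIMARY
    (NAME = IMDemo_data, FILENAME = 'C:\Data\IMDemo_data.mdf'),
FILEGROUP IMDemo_mod CONTAINS MEMORY_OPTIMIZED_DATA
    (NAME = IMDemo_mod, FILENAME = 'C:\Data\IMDemo_mod')
LOG ON
    (NAME = IMDemo_log, FILENAME = 'C:\Data\IMDemo_log.ldf');
GO
USE IMDemo;
GO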

 

First create a simple table:

DROP TABLE IF EXISTS dbo.OrderDetails;
GO

CREATE TABLE dbo.OrderDetails
(
OrderID int NOT NULL,
ProductID int NOT NULL,
UnitPrice money NOT NULL,
Quantity smallint NOT NULL,
Discount real NULL,

INDEX IX_OrderID NONCLUSTERED HASH (OrderID) WITH (BUCKET_COUNT = 1048576),
INDEX IX_ProductID NONCLUSTERED HASH (ProductID) WITH (BUCKET_COUNT = 131072),
CONSTRAINT PK_Order_Details PRIMARY KEY
NONCLUSTERED HASH (OrderID, ProductID) WITH (BUCKET_COUNT = 1048576)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

 

Now let’s look at some of the basic index metadata:

SELECT i.index_id, i.object_id, xtp_object_id, name as index_name
FROM sys.indexes i JOIN sys.dm_db_xtp_memory_consumers mc
ON i.object_id = mc.object_id AND i.index_id = mc.index_id
WHERE i.object_id = object_id('dbo.OrderDetails');
GO

 

Here are my results:

[Image: query results showing index_id, object_id, xtp_object_id, and index_name for the three indexes]

 

Notice the different values for object_id and xtp_object_id. Now make a change to the index. In this example, I’m changing the bucket count.

 ALTER TABLE dbo.OrderDetails
ALTER INDEX IX_OrderID
REBUILD WITH (BUCKET_COUNT=2097152);
GO

When I look at the basic index metadata again, I get the results below:

[Image: query results after the index rebuild — same object_id, new xtp_object_id]

Notice the results are almost the same as the first result set. The object_id is the same, but the xtp_object_id has been increased by one.

You can try this again with another change to the table.  This next ALTER will add a new index:

ALTER TABLE dbo.OrderDetails
ADD INDEX IX_UnitPrice NONCLUSTERED (UnitPrice);
GO

Now the metadata shows four rows, since there are four indexes, with the same object_id. Again, the xtp_object_id value has changed.

[Image: query results after adding the new index — four rows with the same object_id and another new xtp_object_id]

So the new values of xtp_object_id are what show us the table is being rebuilt. Of course, if you run out of memory while performing these operations on a table with a large number of rows, that would also be an indication that the table was actually being recreated!

I hope to see you at one of my SQL Server workshops for DB Best in May, where you can try your hand at queries against memory-optimized tables as well as several other SQL Server 2016 features.