call us toll-free +1 855 855 3600
 
  • Home
  • >
  • Blog
  • >
  • So You Think You Can Search - Comparing Microsoft SQL Server FTS and Apache Lucene
DB Best Chronicles
 
Talks on Big Data, Mobile Apps, Web and Software Development

So You Think You Can Search – Comparing Microsoft SQL Server FTS and Apache Lucene

Posted by | On February 8th, 2013 | In Big Data, Web & Software Development
 

Prologue

In 1999, when I was in my first year of university, if I wanted to check my e-mail I had to come into a lab. I used to come into the class, open browser, type address and then go out to grab some coffee. Usually when I came back I was really happy to see that at least half of the page had been loaded.

Today people are not so patient. They used to get response from the web-sites at least in a few seconds and if your application is not that fast, you have a problem. In fact, “few seconds” is a very optimistic term. Nowadays we measure response time with milliseconds.

If you want to create something really popular in the Web (and your application operates big amount of data), sooner or later you will face the need to implement full text search engine over your system.

My Goals

In this post I will try to describe and compare two technologies – Microsoft SQL Server Full Text Search and Apache Lucene. The first one is an embedded SQL Server feature and the second one is a third-party software library, originally developed in Java and ported to many other platforms, including .NET.

lucene vs sql server fts

How to build a super fast search for your web application.

Read the rest of this entry »

SAP HANA – a New Database Management System

Posted by | On October 8th, 2012 | In Big Data, Database Migration
 

In 2007 I worked for a company which provided data warehousing solutions. Our biggest client database was about 2-3 TB which was not so much even five years ago, but because of extremely complicated ETL and complex reports we had a lot of troubles.

We spent hundreds of hours monthly on code optimization, but every 30 minutes saved on ETL and processing were compensated with monthly data growth and we had to start over and over.

Finally it comes the time when we couldn’t improve anything else in programming code and company decided to drastically upgrade the hardware. But extremely expensive servers and disk storages gives us only several percent performance improvement. We thought about migration to another platform but preliminary estimations showed that every traditional RDBMS has the same bottleneck – disk storage. Maybe, if it was happening today, we probably would turn to fundamentally new technologies.

The whole software development industry is highly innovation-oriented, but RDMBSs always are perceived as something very conservative. I still have record about ANSI-92 SQL knowledge in my CV and almost all fundamental works from 70s are up to date, but in spite of that data management software industry is very fast developing and great new products comes to market each year.

Read the rest of this entry »

How to Implement a SQL Server View Based on Temporary Table

Posted by | On August 13th, 2012 | In Big Data, Database Migration
 

If you ask any experienced SQL Server developer: “Can I build a view on temporary table?” the answer will be “Of course not. And why on Earth you need it?” Well, Oracle developers can tell you why. Temporary tables are very useful when you have to separate data between users/sessions and creating view on them is just convenient way to store some logic. SQL Server developers used to deal without it but when they are trying to move database from Oracle they will have to find workaround because it’s pretty common pattern in Oracle.

Let’s take a look at how it works.

SQL Server View

First thing that comes to mind is to use regular table instead of temporary. To emulate data separation between sessions we can add “spid” column with default value @@spid (system function that returns the session ID of the current user process). Than we need to create view named exactly like original temporary table (of course we have to name our regular table differently) and add where condition “spid = @@spid”. So select from the view will return only that part of data, which was populated in current connection. Next challenge here is that in Oracle data from temporary table is automatically deleted on rollback/commit.
Read the rest of this entry »

How to Create Autonomous Transactions in SQL Server

Posted by | On August 13th, 2012 | In Big Data, Database Migration
 

You probably heard about autonomous transactions in Oracle. In a few words, using just one pragma directive in a PL/SQL block you can isolate it from the callers’ context, so it becomes independent transaction. It’s quite useful if you want organize data audit and is pretty widely used in production databases. As you might guess these autonomous transactions do not have direct equivalent in Microsoft SQL Server. The only way to isolate a Transact-SQL block from a transaction context is to open a new connection. There are several options here, so let’s start with SSMA approach.

Autonomous Transactions in SQL Server

Assume that we have some logging logic in a stored procedure named “dbo.LogError” and we want to commit it even in case when main transaction will be roll backed.

As in previous case with UDF exceptions SSMA use extended procedure (now – to open a new connection). SSMA will wrap our routine like that:
Read the rest of this entry »

How to Raise an Exception in SQL Server User Defined Functions

Posted by | On July 2nd, 2012 | In Database Migration
 

There are a lot of differences between Oracle and MS SQL Server and you will face many of them trying to move your database from one platform to another. If you use SSMA, which is a good thing to do, you can avoid huge amount of manual work, but you will have to solve some specific problems by yourself after SSMA job will be done. One of them can be raising errors from UDF, which you can easily do in Oracle and can’t in SQL Server without some workarounds because of T-SQL limitations for UDF.

SQL Server User Defined Functions

So let’s create some pretty simple UDF in Oracle and take a look on how we can keep its full functionality in MS SQL.

Oracle:

1
2
3
4
5
6
7
8
CREATE OR REPLACE FUNCTION DIVIDE (a int, b int)
RETURN NUMBER IS
BEGIN
IF (b=0) THEN
RAISE_APPLICATION_ERROR(-20002, 'You cannot divide by zero!');
END IF;
RETURN a/b;
END DIVIDE;

 

If you try to rewrite it in T-SQL just with obvious syntax and data types changes like that:

1
2
3
4
5
6
7
CREATE FUNCTION DIVIDE (@a int, @b int)
RETURNS float(53)
BEGIN
IF (@b = 0)
RAISERROR(59998, 16, 1, 'You cannot divide by zero!')
RETURN CAST(@a AS float(53)) / @b
END

you will get an error:
Read the rest of this entry »