DB Best Chronicles
 
Talks on Big Data, Mobile Apps, Web and Software Development

Episode 2 – XML and Binary document formats

Posted on June 11th, 2012 | In Big Data, Database Migration, Social Commerce, Web & Software Development
 



Document Storage Formats – An Introduction

Document storage databases are all the buzz these days. Interestingly enough, they are not a recent invention: 20 years ago there were already object-oriented databases using the very same concepts.

In this four-part mini blog series I will take a bird’s-eye view of the connections and relationships between object-oriented databases, object stores, serialization and document storage. I will present and explain the most common document storage formats and try to find out why object stores are all the buzz today but were not 20 years ago.


Episode 2 – Binary and XML document formats

You could argue that there is no such thing as a binary document storage format. Binary serialization has been around for a long time, well before developers wanted to persist objects directly to long-term storage (without first having to map between object properties and database columns). When passing objects across process boundaries, which is known as marshaling, developers first have to serialize the objects’ state. In its simplest form, the binary document is a string of bytes representing the values of all variable object properties. For the application to be able to interpret the binary document, it has to know the object’s class definition. The class definition is either part of the application code, as a declarative class definition, or the application learns about it from type libraries. Either way, such binary documents are neither human-readable nor machine-interpretable without explicit knowledge of the class definition of the source or destination object.
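As a minimal sketch of that simplest form, the Python snippet below packs the property values of an object into a bare string of bytes. The `Customer` class and its byte layout are hypothetical and chosen only for illustration; the point is that the bytes carry no field names or types, so the reader must already know the layout.

```python
import struct
from dataclasses import dataclass

# Hypothetical layout, known to both writer and reader:
# 4-byte unsigned int, 8-byte float, 1-byte bool.
# "<" selects little-endian byte order without padding.
CUSTOMER_LAYOUT = struct.Struct("<Id?")


@dataclass
class Customer:
    customer_id: int
    balance: float
    active: bool


def serialize(customer: Customer) -> bytes:
    """Write only the property values; no names or types are stored."""
    return CUSTOMER_LAYOUT.pack(
        customer.customer_id, customer.balance, customer.active
    )


def deserialize(data: bytes) -> Customer:
    """Only works because the reader knows the class definition."""
    return Customer(*CUSTOMER_LAYOUT.unpack(data))


blob = serialize(Customer(42, 99.5, True))
restored = deserialize(blob)
print(len(blob), restored)
```

Without `CUSTOMER_LAYOUT`, the 13 bytes in `blob` could just as well be read as text, as thirteen small integers, or as something else entirely.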

Recent implementations of binary document formats are more sophisticated. They contain metadata describing and identifying the structure from which the object’s variable properties were serialized. While this increases the size of the resulting document, it makes the document more portable and allows its data to be interpreted even after the original object’s class definition has long been forgotten.
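The idea can be sketched with a toy self-describing binary encoding. This is an illustrative format invented for this example, not any real product's wire format: each field is stored as its name, a one-byte type tag, and its value, so a reader can decode the document without the original class definition.

```python
import struct


def encode(record: dict) -> bytes:
    """Toy format: field count, then (name length, name, type tag, value)."""
    out = bytearray(struct.pack("<H", len(record)))
    for name, value in record.items():
        raw_name = name.encode("utf-8")
        out += struct.pack("<B", len(raw_name)) + raw_name
        # bool must be tested before int, because bool is a subclass of int.
        if isinstance(value, bool):
            out += b"?" + struct.pack("<?", value)
        elif isinstance(value, int):
            out += b"q" + struct.pack("<q", value)
        elif isinstance(value, float):
            out += b"d" + struct.pack("<d", value)
        else:
            raise TypeError(f"unsupported type for field {name!r}")
    return bytes(out)


def decode(data: bytes) -> dict:
    """Rebuild the record using only the metadata stored in the bytes."""
    (count,) = struct.unpack_from("<H", data, 0)
    pos = 2
    record = {}
    for _ in range(count):
        name_len = data[pos]
        pos += 1
        name = data[pos:pos + name_len].decode("utf-8")
        pos += name_len
        fmt = "<" + chr(data[pos])  # the type tag doubles as a struct code
        pos += 1
        (value,) = struct.unpack_from(fmt, data, pos)
        pos += struct.calcsize(fmt)
        record[name] = value
    return record


original = {"customer_id": 42, "balance": 99.5, "active": True}
encoded = encode(original)
decoded = decode(encoded)
print(len(encoded), decoded)
```

The embedded names and type tags make the document larger than the bare property values would be, which is the size-for-portability trade-off described above.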

Obviously, knowing the structure of a document is useful in many respects. That is why structured document formats contain more or less metadata that describes and “communicates” their structure; binary document formats are no longer an exception.

While binary document formats usually contain very little metadata and are therefore considered the “leaner” document formats, XML sits at the other end of that scale: it is a very “bloated” or, put more nicely, a very “verbose” document format.
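A rough way to see the difference in verbosity is to serialize the same values both ways and compare sizes. The record and its field names below are hypothetical, and the comparison is only a sketch; real formats and real data will give different ratios.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical record used only for this comparison.
record = {"customer_id": 42, "balance": 99.5, "active": True}

# Lean binary form: values only, layout agreed on out of band.
binary_doc = struct.pack(
    "<Id?", record["customer_id"], record["balance"], record["active"]
)

# Verbose XML form: every value is wrapped in named opening and closing tags.
root = ET.Element("customer")
for name, value in record.items():
    ET.SubElement(root, name).text = str(value)
xml_doc = ET.tostring(root)  # returns bytes

print(len(binary_doc), "bytes as binary")
print(len(xml_doc), "bytes as XML")
print(xml_doc.decode("ascii"))
```

The XML version is considerably larger, but it can be read by a person and parsed by any XML tool without knowing the layout string `"<Id?"`.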

According to Wikipedia “XML” is defined as follows:
Read the rest of this entry »

Episode 1 – Introduction and definitions

Posted on May 23rd, 2012 | In Big Data, Database Migration, Social Commerce, Web & Software Development
 



Episode 1 – Introduction and definitions

Document Storage: In the context of software development, a document is, plain and simple, a computer data file or data set. Depending on its purpose, the data file or set might hold different content. Often, when the content of the document requires a predictable structure, metadata is introduced to define the formatting of the document’s data. In that case the document’s data is considered structured; otherwise it is considered unstructured. An example of unstructured data is a Word document; an example of structured data is an XML document.

Read the rest of this entry »