Relational databases (RDBMS) like SQL Server, Oracle, MySQL and even
Access all store tabular data row-by-row. This structure is best for
transactional/operational systems that require large numbers of
concurrent insertions. With indexes, it can also provide realistic
query response times for row-based queries that do not frequently
require aggregations or joining of many tables.
Data
analysis often requires aggregation of data as well as merging of data
located in multiple disparate tables. When dealing with these types of
queries, relational databases reach their limits pretty quickly. The
only way to extend these limits are by putting in stronger hardware and
pre-aggregating data to reduce the amounts of calculations that occur
in real time.
The ElastiCube Columnar Database
ElastiCube data is held in a Columnar Database Managment System (CDBMS) that stores data field-by-field. Each field is individually stored in a memory-mapped file, the same mechanism the Windows operating system pagefile uses for memory dumping and loading.
When a query is executed over an ElastiCube, only fields referenced in the query need to be loaded into memory. This leaves enough space for actually
processing the query entirely in memory without any read/write to the hard-drive - the prime reason for poor performance of queries. Once a field is no longer used, it is removed from memory and its consumed space is released.
This approach has several advantages:
- Query response time: queries over data sets containing millions of rows of data return in seconds even under modest hardware configurations such as desktop computers
- Materialization time: ElastiCubes do not require pre-aggregations and/or creation of indexes to assure fast query response, therefore the actual creation of an ElastiCube takes a fraction of the time of a data mart or an OLAP cube
- Storage space: Pre-aggregations and and creation indexes are not needed to assure fast query response, making an ElastiCube's size significantly smaller than a datamart or an OLAP cube
Technical Considerations
ElastiCube technology is most reliant on the following hardware specifications:
- 32/64-bit Operating System: ElastiCubes work both under 32-bit and 64-bit Windows environments (different installations are required). Because ElastiCubes use memory-mapped files, the total size of an ElastiCube is limited to approximately the size of addressable memory: 4GB under 32-bit and (4GB)^2 under 64-bit.
- The amount of free RAM: ElastiCubes are designed for in-memory processing of queries. The more available RAM exists, the better..
- The number of CPUs and CPU cores: ElastiCubes utilize multi-core/multi-CPU configurations to achieve parallel processing.