NoSQL Databases
CData Virtuality supports a range of NoSQL databases: non-relational systems designed to handle large volumes of unstructured or semi-structured data. Unlike traditional relational databases, they offer flexible schemas and are optimized for scalability and fast access across distributed environments. Because NoSQL data is not organized into tables the way data in a SQL database is, CData Virtuality has to map it into a form that can be queried with SQL. To do this, the NoSQL connectors map JSON to relational structures, including nested documents and arrays.
Drivers
CData offers JDBC drivers that connect to many NoSQL sources, such as MongoDB and Cassandra.
Metadata Discovery and Data Type Conversion
CData Virtuality reads the schema and infers columns from the keys and fields in JSON and XML data. To infer the data structure, it performs a row scan that inspects a configurable number of rows at the start of the data. You can adjust the number of rows to scan with the RowScan property. A larger RowScan value makes the first retrieval of metadata take longer, but the metadata is then cached, so later queries do not repeat the scan. CData Virtuality then builds virtual tables that represent the data as tables you can query. If the default data model does not meet your requirements, you can adjust how the data is modeled. CData Virtuality also converts NoSQL data types into the appropriate SQL types and, when necessary, converts nested JSON objects into parent and child views.
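The row scan and the parent/child split described above can be illustrated with a minimal Python sketch. This is not CData Virtuality's implementation: the function names, the `row_scan` parameter, the dotted column names, the `parent_id` and `index` columns, and the type names in `SQL_TYPES` are all assumptions made for illustration.

```python
from typing import Any

# Hypothetical mapping from Python value types to SQL type names.
SQL_TYPES = {bool: "boolean", int: "bigint", float: "double", str: "string"}


def flatten(doc: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested objects into dotted column names.

    Arrays are skipped here because they are modeled as child tables.
    """
    flat: dict[str, Any] = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        elif isinstance(value, list) or value is None:
            continue
        else:
            flat[name] = value
    return flat


def infer_columns(docs: list[dict[str, Any]], row_scan: int = 100) -> dict[str, str]:
    """Infer column names and SQL types from the first `row_scan` documents."""
    columns: dict[str, str] = {}
    for doc in docs[:row_scan]:
        for name, value in flatten(doc).items():
            sql_type = SQL_TYPES.get(type(value), "string")
            # Widen to string when sampled rows disagree on the type.
            if columns.get(name, sql_type) != sql_type:
                sql_type = "string"
            columns[name] = sql_type
    return columns


def child_rows(docs: list[dict[str, Any]], array_field: str, key: str = "id") -> list[dict[str, Any]]:
    """Turn an array field into child-table rows that reference the parent key."""
    rows = []
    for doc in docs:
        for index, item in enumerate(doc.get(array_field, [])):
            row = {f"parent_{key}": doc[key], "index": index}
            row.update(flatten(item) if isinstance(item, dict) else {"value": item})
            rows.append(row)
    return rows
```

In this sketch, a field that first appears after the scanned rows is simply missing from the inferred columns, which is why the number of rows scanned matters when documents vary in shape.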
SQL Query Translation
When you write a SQL query against a NoSQL source, CData Virtuality translates it into the native API or query language of that source, such as the MongoDB query language or the Elasticsearch Query DSL. SQL features that the source does not support, such as joins, are executed in CData Virtuality's own query engine.
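To make the idea of translation concrete, the following Python sketch turns the column list and WHERE conditions of a simple SELECT into a MongoDB filter document and projection. It is an illustration only: `to_mongo_find` and the tuple representation of conditions are hypothetical, and the actual translation performed by CData Virtuality is not described here. The `$eq`, `$ne`, `$gt`, `$gte`, `$lt`, `$lte`, and `$and` operators are standard MongoDB query operators.

```python
from typing import Any

# SQL comparison operators and their MongoDB query-operator equivalents.
MONGO_OPS = {"=": "$eq", "<>": "$ne", ">": "$gt", ">=": "$gte", "<": "$lt", "<=": "$lte"}


def to_mongo_find(
    columns: list[str],
    conditions: list[tuple[str, str, Any]],
) -> tuple[dict[str, Any], dict[str, int]]:
    """Build a (filter, projection) pair for a MongoDB find() call.

    `conditions` holds (column, operator, value) tuples that are combined with AND.
    """
    clauses = [{col: {MONGO_OPS[op]: val}} for col, op, val in conditions]
    # $and needs a non-empty array, so an empty WHERE becomes an empty filter.
    filter_doc = {"$and": clauses} if clauses else {}
    projection = {col: 1 for col in columns}
    return filter_doc, projection
```

For example, `SELECT name, age FROM users WHERE age > 30 AND city = 'Berlin'` would correspond to `to_mongo_find(["name", "age"], [("age", ">", 30), ("city", "=", "Berlin")])`.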
Query Pushdown
Wherever possible, CData Virtuality pushes filtering and projection down to the NoSQL backend instead of performing them in its own engine. See Pushdown for more information.
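As a rough sketch of the pushdown decision, the Python example below splits a query's conditions into those a source can evaluate and those that must be applied afterwards in the engine. The names `split_predicates` and `apply_residual`, and the notion of a per-source set of supported operators, are assumptions for illustration and do not reflect how CData Virtuality plans queries internally.

```python
import operator
from typing import Any

Condition = tuple[str, str, Any]  # (column, operator, value)

# Operators the engine itself can evaluate on rows returned by the source.
LOCAL_OPS = {"=": operator.eq, "<>": operator.ne, ">": operator.gt, "<": operator.lt}


def split_predicates(
    conditions: list[Condition], supported_ops: set[str]
) -> tuple[list[Condition], list[Condition]]:
    """Separate conditions the source can run from those left for the engine."""
    pushed = [c for c in conditions if c[1] in supported_ops]
    residual = [c for c in conditions if c[1] not in supported_ops]
    return pushed, residual


def apply_residual(rows: list[dict[str, Any]], residual: list[Condition]) -> list[dict[str, Any]]:
    """Filter rows in the engine using the conditions that were not pushed down."""
    return [
        row
        for row in rows
        if all(LOCAL_OPS[op](row[col], val) for col, op, val in residual)
    ]
```

The more conditions end up in the pushed list, the fewer rows the source has to return, which is the motivation for pushing down as much as possible.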
Caching and Performance
CData Virtuality lets you cache the results of NoSQL queries so that repeated queries return faster.
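The general idea behind result caching can be sketched in a few lines of Python. This is a generic time-to-live cache keyed by query text, not CData Virtuality's caching mechanism; the `QueryCache` class, its `ttl_seconds` parameter, and the `run_query` callback are hypothetical.

```python
import time
from typing import Any, Callable


class QueryCache:
    """Cache query results by SQL text for a limited time."""

    def __init__(
        self,
        ttl_seconds: float,
        run_query: Callable[[str], Any],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.ttl = ttl_seconds
        self.run_query = run_query  # called only on a cache miss
        self.clock = clock
        self.entries: dict[str, tuple[float, Any]] = {}

    def get(self, sql: str) -> Any:
        now = self.clock()
        hit = self.entries.get(sql)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]  # still fresh: skip the round trip to the source
        rows = self.run_query(sql)
        self.entries[sql] = (now, rows)
        return rows
```

A repeated query within the time-to-live is answered from memory; once the entry expires, the next call goes back to the source and refreshes it.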
Supported Databases
Among the supported NoSQL databases are the following:
Apache Cassandra
Couchbase