Objectivity/DB Spark Adapter: Getting Started
What is the Objectivity/DB Spark Adapter?
The Objectivity/DB Spark Adapter is part of ThingSpan, a versatile, highly scalable graph and data fusion platform. The Objectivity/DB Spark Adapter enables you to use an Objectivity/DB data store in the Apache Spark™ and Hadoop® ecosystem. Doing so combines Objectivity/DB’s distributed data storage with the parallel processing of Spark SQL and Hadoop.
The Objectivity/DB data store is called a federated database. A federated database is well suited to storing metadata, and equally to storing derived data and application data.
Integration With Spark SQL
The Objectivity/DB Spark Adapter integrates Objectivity/DB with the Spark environment through the Spark SQL module. The adapter lets Spark driver applications read from and write to Objectivity/DB federated databases through a Spark DataFrame. As a reminder, a Spark DataFrame is a table in which each row represents a single instance and each column holds a named variable of a given type. The DataFrame's schema (its column names and types) is how Spark describes the structure of the data.
You can create a Spark driver application in Scala, Java, or Python (the tutorial uses Scala).
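As a rough illustration of reading through a DataFrame, the following Python sketch wraps the standard Spark SQL reader calls (format, options, load). The data source name and the two option names are assumptions that this page does not state, so confirm them against the Spark Adapter Reference. The helper names objy_options and read_class are hypothetical.

```python
"""Sketch: load Objectivity/DB objects into a Spark DataFrame from a Python driver."""

# Assumed values, not confirmed by this page; check the Spark Adapter Reference.
OBJY_FORMAT = "com.objy.spark.sql"        # assumed data source name
BOOT_FILE_OPTION = "objy.bootFilePath"    # assumed option: federated database boot file
CLASS_NAME_OPTION = "objy.dataClassName"  # assumed option: class whose objects become rows


def objy_options(boot_file, class_name):
    """Build the option map that names a federated database and one of its classes."""
    return {BOOT_FILE_OPTION: boot_file, CLASS_NAME_OPTION: class_name}


def read_class(spark, boot_file, class_name):
    """Return a DataFrame with one row per persistent object of class_name.

    spark is an existing pyspark.sql.SparkSession created by the driver.
    """
    return (
        spark.read.format(OBJY_FORMAT)
        .options(**objy_options(boot_file, class_name))
        .load()
    )
```

A driver might then call read_class(spark, "/path/to/fd.boot", "Customer"), where the path and class name are placeholders, and inspect the result with printSchema() to see how the class's attributes appear as columns.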
Objects in the Federated Database
A federated database’s schema describes the types of objects that it can store. The schema contains a description of every class defined by applications that access the federated database. There are several approaches for creating schema:
You can create the schema for a new class by adding the first instance of that class using Spark.
Be aware that the data types you use for fields (called attributes in Objectivity/DB) are translated to Objectivity/DB types as described in Schema Generation.
You can add schema using the DO query language or by sending commands to the Objectivity REST server.
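The first approach, adding instances through Spark, might look like the following Python sketch, which wraps the standard Spark SQL writer calls (format, mode, option, save). The data source name, the option names, and the default save mode are assumptions not stated on this page, and write_instances is a hypothetical helper. Check the Spark Adapter Reference for the supported values.

```python
"""Sketch: write DataFrame rows to a federated database as instances of a class."""

# Assumed values, not confirmed by this page; check the Spark Adapter Reference.
OBJY_FORMAT = "com.objy.spark.sql"        # assumed data source name
BOOT_FILE_OPTION = "objy.bootFilePath"    # assumed option: federated database boot file
CLASS_NAME_OPTION = "objy.dataClassName"  # assumed option: target class for the rows


def write_instances(df, boot_file, class_name, mode="append"):
    """Save each row of df as a persistent object of class_name.

    df is a pyspark.sql.DataFrame. The save mode is passed straight to Spark;
    which modes the adapter accepts is an assumption to verify.
    """
    (
        df.write.format(OBJY_FORMAT)
        .mode(mode)
        .option(BOOT_FILE_OPTION, boot_file)
        .option(CLASS_NAME_OPTION, class_name)
        .save()
    )
```

For example, a driver could build a small DataFrame with spark.createDataFrame([("Ada", 36)], ["name", "age"]) and pass it to write_instances with a placeholder boot file path and the class name "Person". The column types of the DataFrame are then translated to attribute types as described in Schema Generation.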
Every persistent object in the federated database is an instance of one of the classes described in the schema. Each object has a unique object identifier (OID) that distinguishes it from all other persistent objects.
Objects can have relationships (called associations in Objectivity/DB) to other objects through references or lists of references.
Learn More
You can learn how to use the Spark adapter by starting with the tutorial:
Spark Adapter Tutorial
For additional details, see:
Spark Adapter Reference
For additional deployment scenarios, see:
Advanced Topics