Spark Adapter Tutorial
This tutorial shows how a Spark driver application interacts with an Objectivity/DB federated database using Spark SQL and Scala.
Tutorial Setup
1. Download the tutorial sources from the Objectivity Developer Network Learning Center.
2. Extract the files, noting the top-level ObjySparkTutorial directory.
3. If you haven’t already done so, install and configure ThingSpan by following the ThingSpan Setup steps in the ObjySparkTutorial\readme file. These steps describe:
   - Where to find the ThingSpan installer.
   - How to set up your environment variables and Objectivity license file.
4. If you haven’t already done so, follow the Spark Setup steps in the ObjySparkTutorial\readme file. These steps describe:
   - Where to find the required third-party tools.
   - How to set up your environment variables for these tools.
5. Follow the Tutorial Setup steps in the ObjySparkTutorial\readme file. These steps:
   - Build the tutorial application.
   - Create an Objectivity/DB federated database.
Note: You must complete the setup steps in the readme file before working through the tutorial tasks.
The tutorial uses the Gradle build automation system to build the sample Spark driver application and to run the tutorial tasks. For more information about Gradle, see the Gradle website.
Tutorial Task Overview
The following table summarizes the tasks that you will complete as you work through this tutorial.
Tutorial Tasks
Topic
Store new objects in the federated database using a Spark SQL data frame.
Perform an inner join on a data frame loaded from the federated database and a data frame loaded from a JSON file, writing results back to the federated database.
Load an Objectivity/DB data frame in which objects of a given type are identified by their OIDs, then modify and write back particular objects.
Add new schema for objects that have relationships to each other, then learn how to distribute instances of these types across multiple storage locations using a round-robin strategy.
Create objects and establish relationships between them using inner joins.