Quick Start
The library has different modules that can be imported separately
- BigQuery
libraryDependencies += "io.github.data-tools" %% "big-data-types-bigquery" % "{version}"
- Spark
libraryDependencies += "io.github.data-tools" %% "big-data-types-spark" % "{version}"
- Cassandra
libraryDependencies += "io.github.data-tools" %% "big-data-types-cassandra" % "{version}"
- Circe (JSON)
libraryDependencies += "io.github.data-tools" %% "big-data-types-circe" % "{version}"
- Core
- To get support for abstract SqlTypes, it is included in the others, so it is not needed if you are using one of the others
libraryDependencies += "io.github.data-tools" %% "big-data-types-core" % "{version}"
Versions for Scala , and are available in Maven
In order to transform one type into another, both modules have to be imported.
How it works
The library internally uses a generic ADT (SqlType) that can store any schema representation, and from there, it can be converted into any other. Transformations are done through 2 different type-classes.
Quick examples
Case Classes to other types
//Spark
val s: StructType = SparkSchemas.schema[MyCaseClass]
//BigQuery
val bq: List[Field] = SqlTypeToBigQuery[MyCaseClass].bigQueryFields // just the schema
BigQueryTable.createTable[MyCaseClass]("myDataset", "myTable") // Create a table in a BigQuery real environment
//Cassandra
val c: CreateTable = CassandraTables.table[MyCaseClass]
There are also extension methods
that make easier the transformation between types when there are instances
//from Case Class instance
val foo: MyCaseClass = ???
foo.asBigQuery // List[Field]
foo.asSparkSchema // StructType
foo.asCassandra("TableName", "primaryKey") // CreateTable
Conversion between types works in the same way
// From Spark to others
val foo: StructType = myDataFrame.schema
foo.asBigQuery // List[Field]
foo.asCassandra("TableName", "primaryKey") // CreateTable
//From BigQuery to others
val foo: Schema = ???
foo.asSparkFields // List[StructField]
foo.asSparkSchema // StructType
foo.asCassandra("TableName", "primaryKey") // CreateTable
//From Cassandra to others
val foo: CreateTable = ???
foo.asSparkFields // List[StructField]
foo.asSparkSchema // StructType
foo.asBigQuery // List[Field]
foo.asBigQuery.schema // Schema
Check the complete guide on how to create a new type to understand how the library works internally