40
Open Connectivity BI, Integration, and Apps on Couchbase using ODBC and JDBC June, 2015

Open Connectivity: BI, Integration and Applications on Couchbase Using ODBC and JDBC: Couchbase Connect 2015

Embed Size (px)

Citation preview

Open Connectivity

BI, Integration, and Apps on Couchbase using ODBC and JDBC

June, 2015

• Who I am

• Driver(s) Overview

• Relational Model

• Demos

• ODBC in-depth

• JDBC in-depth

• QA

Contents

• Worked in the data access space for eight years and counting• ODBC, OLEDB, JDBC, ADO.NET, ODBO, XMLA

• Core developer for the current generation of Simba’s data access technologies

• Collaborated at an Engineering Level with Simba ISV Customers to design and implement data drivers that are today being shipped world wide

Kyle at a glance

• Simba connects people to data.

• HQ’ed in Vancouver, BC.

• 100ish employees.

• Founded in 1991.

• In 1992, Simba co-authored the original ODBC standard with Microsoft.

• Simba produces the SimbaEngine® SDK and drivers for the leading data sources on multiple platforms.

Simba Technologies at a glance

Simba Technologies at a glance

• Partnership to create read/write ODBC and JDBC drivers• ODBC 3.80

• JDBC 4.0 and 4.1

• Allow easy access to data within Couchbase from your favourite BI and ETL tools

Why is Simba here?

What is an ODBC / JDBCDriver?N1QL mode to alloweasy and advancedanalytics

• Couchbase is NoSQL• Dynamic schema, documents vary within a bucket

• ODBC and JDBC are SQL• Expect a fixed schema, each column is one type

• Must map from dynamic schema data to fixed schema data

Schema(less)

• SQL• Catalog, Schema, Table

• Couchbase• Namespace, Keyspace

Schema => Namespace

Table => Keyspace (sort of)

Relational Mapping

http://www.prabathsl.com/2013/02/document-oriented-database_14.html

Sample JSON Document:

{“Id” : 1, “Name”: “Couchbase”, “Values” : [V1,V2]}

Simple Flattening

Id Name Values[0] Values[1]

1 Couchbase V1 V2

Sample JSON Document:

{“Id” : 1, “Name”: “Couchbase”, “Values” : [V1,V2]}

Parent

Child

Re-Normalization

Id Name

1 Couchbase

Id Index Value

1 0 V1

1 1 V2

Demos

• C API

• Versions: 2.x, 3.0, 3.52, 3.80, etc…• Non-Windows platforms are ODBC 3.52

• Driver Managers• Windows, iODBC, unixODBC, etc…

• All functions have return codes• SQL_SUCCESS, SQL_SUCCESS_WITH_INFO, SQL_ERROR, etc…

ODBC Technicals

• SQLHENV, SQLHDBC, SQLHSTMT, SQLHDESC

• Relationship is one-to-many

ODBC: Handles

• Allocate with SQLAllocHandle, free with SQLFreeHandle

• Ensure you set the version of ODBC in use

SQLSetEnvAttr(

hEnv,

SQL_ATTR_ODBC_VERSION,

(SQLPOINTER)SQL_OV_ODBC3_80,

SQL_IS_INTEGER)

ODBC: Environment

• Allocate with SQLAllocHandle, created from SQLHENV

• Maintains the actual connection to Couchbase

• Create child statement objects to do work

• Disconnect with SQLDisconnect, free with SQLFreeHandle

ODBC: Connection

Open a connection using SQLDriverConnectSQLDriverConnect(

hDbc,

windowHandle,

connStr, SQL_NTS, connStrLen

&outStrLen,

SQL_DRIVER_COMPLETE)

Last parameter allows driver to prompt for information if connStr doesn’t contain all necessary information.

ODBC: Connection

• Specify a Driver or DSN• Driver: Must specify all options in connection string• DSN: Can specify options in connection string

Example:

“DSN=Couchbase;UID=kylep;PWD=testPassword;”

ODBC: Connection String

• Allocate with SQLAllocHandle, created from SQLHSTMT

• Used for issuing queries, retrieving catalog metadata

• Free with SQLFreeHandle

ODBC: Statement

• Use SQLExecDirect for one-off queries

• Use SQLPrepare and SQLExecute for repeated queries

SQLExecDirect(hStmt, “<query>”, SQL_NTS)

or

SQLPrepare(hStmt, “<query>”, SQL_NTS)

SQLExecute(hStmt)

ODBC: Querying

• After execution, there is a cursor on the results, positioned before the first row

• Use SQLFetch to move the cursor

• SQLGetData can be used to fetch cell by cell

• SQLBindCol can also be used, much more efficient

ODBC: Results

• Reuse connections so driver caches are effective

• Use SQLBindCol over SQLGetData

• Use array fetches with SQLBindCol

• Use SQLPrepare once, SQLExecute multiple times with parameters for loading data

• Use parameter arrays when binding parameters with SQLBindParameter

• Bind types to match reported parameter or column types

ODBC: Performance Tips

SQLGetData vs. SQLBindCol

10 million rows, 3 columns of wide char, integer, decimal values

ODBC: Performance Tips

Method Time (s)

SQLGetData 141.428

SQLBindCol 6.102

SQLBindCol (array[100]) 2.747

• Is the Java version of ODBC

• Versions: 3.0, 4.0, 4.1, etc…• JDBC version is tied to Java version

• Simba will supply JDBC 4.0 and 4.1 versions

• There is a driver manager, but role is very limited

• Errors are reported via exceptions, warnings via getWarnings()

JDBC Technicals

• Relationship is again one-to-many

JDBC: Object Hierarchy

• Never used directly by your code

• Must be loaded by referencing using Class.forName()

• FQCNs• com.simba.couchbase.jdbc4.Driver

• com.simba.couchbase.jdbc41.Driver

• URL• “jdbc:couchbase://<host>:<port>/<schema>;UseN1QLMode=0/1”

JDBC: Driver

• ExampleString url = “jdbc:couchbase://localhost:8093/default;”Class.forName(“com.simba.couchbase.jdbc4.Driver”);Connection con = DriverManager.getConnection(url);

• With JDBC 4.0 and later, Class.forName() can be omitted

JDBC: Driver

• Used directly by your code

• FQCNs• com.simba.couchbase.jdbc4.DataSource

• com.simba.couchbase.jdbc41.DataSource

• Can be used to programmatically set connection properties by using functions instead of a connection string

• Allows use of advanced features such as PooledConnection

• Used less often than Driver

JDBC: DataSource

• Can create child statement objects for queries

• Can create DatabaseMetaData objects for metadata via getMetaData()

• Common pitfalls• Not closing connections; use a finally block to ensure it is closed

• Not checking warnings, ever; call getWarnings() regularly

• Creating a new connection for every operation; reuse connections where possible

JDBC: Connection

• DatabaseMetaData• Created by Connection objects

• Actually only one object per connection, cached and reused

• Provides access to catalog metadata• getCatalogs(), getTables(), getColumns(), etc…

• Provides access to database metadata• getDatabaseProductVersion(), getIdentifierQuoteString(), etc…

JDBC: Database MetaData

• Created by Connection objects

• Three different types• Statement – for one-off querying

• PreparedStatement – for queries with parameters

• CallableStatement – for stored procedures with output parameters*

• Statement objects will eventually dispose of themselves once out of scope, but best practice is to close() them when done

JDBC: Statement Objects

• Cannot use parameters

• Use execute(), executeQuery(), or executeUpdate() to execute SQL or N1QL queries

JDBC: Statement

execute() Example

Statement stmt = conn.createStatement();try {if (stmt.execute(“select * from beer-sample”)) {

ResultSet rs = stmt.getResultSet();rs.close();

}}finally {stmt.close();

}

JDBC: Statement

• For use with parameters

• Can get metadata about results before execution with getResultSetMetaData()

• Can get metadata about parameters using getParameterMetaData()

• Use set*() functions to provide parameter values

JDBC: PreparedStatement

• When loading data, use batches• Set all required parameters for one execution

• Call addBatch() to add the current set of parameters

• Call executeBatch() to execute all added batches at once

• Set parameters as reported types to avoid conversion overhead in the driver

• Reuse the statement for multiple executions

JDBC: PreparedStatement

• Represents query results or catalog metadata

• Describe the result set using getMetaData()

• Move through result using next()• Can use isAfterLast(), isFirst(), isLast(), etc. to check cursor

position.

• Retrieve cell values using get*() methods• The driver supports all conversions between types listed by the JDBC spec

• Try to retrieve as requested type to avoid conversion overhead

• Remember to check wasNull() after calling get*() method

JDBC: ResultSet

ResultSet Example

ResultSet rs = stmt.executeQuery(“<query>”);try { int numColumns = rs.getMetaData().getColumnCount();while (rs.next()) { for (int i = 0; i < numColumns; ++i) { System.out.println(rs.getString(i)); }}

}finally {rs.close();

}

JDBC: ResultSet

Q & A

simba.com