FileMaker Konferenz 2010

ESS Hinter den Kulissen (ESS Under the Hood)

Galt Johnson

FileMaker, Inc., Santa Clara, California

Curriculum Vitæ

Red Brick Systems (Ralph Kimball)

• Designed logical SQL compiler

• Materialized view support

IBM DB2 Alphablox

• Architect (Data Stack)

• Multi-dimensional database support (Essbase, MSAS)

• MDX compiler and execution engine

FileMaker

• External SQL Sources

• FQL Engine


Agenda

• What is ESS?

• SQL Database Intro

• Basic ESS Architecture

• ESS from the Inside

• Performance Considerations

• Best Practices


What is ESS?

• External SQL Source (ODBC Data Source)

• Gives FileMaker applications access to SQL data sources

• Hides almost all of the SQL “ugliness” from FileMaker developers

• Provides FileMaker developers with a familiar framework in which to develop their applications

Design Goals

• Goals

• Easy to use

• Provide FileMaker-like functionality on top of SQL data sources. ESS tables are just like FileMaker tables.

• Good performance and scalability

• Anti-Goals

• Not a SQL reporting tool

• Not for processing large amounts of data


SQL Basic Terminology

FileMaker    SQL
Table        Table
Record       Row
Field        Column
Index        Index

SQL Terminology

Key

• Column or set of columns whose values uniquely identify all rows in a table

Examples

• SKU in a product table

• Order ID and Line Item Number in an “Order Details” table

• Customer ID in a customer table

But not

• Last name in an “Employee” table

• Time of day in an “Orders” table

• ProductID in an “Inventory History” table
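The definition of a key can be checked mechanically on sample data. The following is a minimal sketch only: the helper name `is_key` and the sample "Order Details" rows are assumptions made for illustration and are not part of ESS. Passing the check on a sample does not prove that a column set is a key for all future data.

```python
def is_key(rows, columns):
    """Return True if the given columns uniquely identify every row."""
    seen = set()
    for row in rows:
        value = tuple(row[c] for c in columns)
        if value in seen:
            return False  # two rows share this value, so it is not a key
        seen.add(value)
    return True


# Hypothetical "Order Details" rows
order_details = [
    {"order_id": 1, "line_item": 1, "sku": "A-100"},
    {"order_id": 1, "line_item": 2, "sku": "B-200"},
    {"order_id": 2, "line_item": 1, "sku": "A-100"},
]

order_id_alone = is_key(order_details, ["order_id"])
order_id_and_line = is_key(order_details, ["order_id", "line_item"])
```

Here the order ID alone repeats across line items, while the pair of order ID and line item number is unique in the sample.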

Primary Keys

• Primary Key

• The key the SQL DB uses to identify a row uniquely

• Surrogate

• Not related to the data itself (e.g. auto-increment column, GUIDs)

• Often called ID (Customer ID, Shipment ID, etc.)

• Natural

• Part of the data

• State and City names

• SKU

• UPC

More Terminology

• ODBC

• A technology developed by Microsoft.

• Programming API

• SQL standardization

• De facto industry standard

• FileMaker can access ODBC data sources:

• Microsoft SQL Server 2000, 2005 and 2008*

• Oracle 9i, 10g and 11g*

• MySQL Community Edition 5.0 and 5.1*

* New to FileMaker 10


ESS Architecture - Client

[Diagram: FileMaker Pro 10.0 with ESS, ODBC, and the SQL database; part of the stack is labeled "Third Party"]

ESS Architecture - Server

[Diagram: Client (FileMaker Pro 10) and Host (FileMaker Server 10); boxes labeled ESS, ODBC Driver, and SQL Database]

FileMaker ESS Tables

• Treated similarly to external table references

• Shadow fields are created for SQL columns

• Primary key and/or unique columns are automatically discovered

• Unsupported types are not available

• VARBINARY, BLOB, etc.

• INTERVAL types

• Some validations are automatically imported

• Auto-increment (SQL Server and MySQL)

• Text field lengths

• However, ESS tables appear in the local table list as shadow tables.

ESS Terminology

Shadow table: references an ODBC data source


ESS Terminology

Shadow fields

Supplemental fields

Reload shadow fields


FileMaker ESS Tables – Additional Options

• Some aspects of shadow fields can be modified

• Auto-Enter values

• Some validations (range, value list, calc)

• Some cannot be modified

• Data type is strictly enforced (e.g. no strings in number columns)

• Some range validations and data integrity rules are enforced by the SQL DB.

• Additional fields can be added

• Summary fields

• Calculated fields

• Cannot be stored or indexed


FileMaker ESS Tables – Additional Options

• Shadow fields may be removed

• FileMaker will ignore those fields in the SQL table

• The underlying SQL table is not altered in any way

• Supplemental fields

• Calculated fields based on values in shadow fields or other calculated fields

• Never stored or indexed

• Summary fields

• Value Lists

• FileMaker Pro v10 and later clients only

• The ESS field may appear as either the primary or secondary field, or both.


SQL to FileMaker Data Type Mapping

FileMaker                          SQL
Text                               CHAR, VARCHAR, TEXT, CLOB and UNICODE equivalents
Number                             INTEGER, DECIMAL, NUMBER, REAL, FLOAT, DOUBLE PRECISION, etc.
Date                               DATE, TIMESTAMP, DATETIME
Time                               TIME, TIMESTAMP, DATETIME
Timestamp                          TIMESTAMP, DATETIME, SMALLDATETIME
Container, Calculation, Summary    N/A

MSSQL Server DATETIME Columns

MSSQL Server DATETIME column*

• You may override its type to DATE

• FileMaker assumes all data in the DATETIME column has a time of 12:00AM (midnight).

• This is SQL Server’s default behavior for DATETIME values with no time (e.g. ‘2009-08-16’ implies midnight on that date).

• You may override its type to TIME

• FileMaker assumes all data in the DATETIME column has a date of January 1, 1900

• This is SQL Server’s default behavior for DATETIME values with no date (e.g. ‘13:27’ implies a date of 1900-01-01).

• This is only an option. You can always leave DATETIME columns as Timestamp fields.

* New to FileMaker 10
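The two override rules above can be illustrated with plain date arithmetic. This is a sketch under stated assumptions: the helper names are hypothetical, and the code only mirrors the defaults described on this slide (midnight for a missing time, January 1, 1900 for a missing date); it does not call FileMaker or SQL Server.

```python
from datetime import date, datetime, time

BASE_DATE = date(1900, 1, 1)  # date assumed when a DATETIME holds only a time
MIDNIGHT = time(0, 0)         # time assumed when a DATETIME holds only a date


def date_as_datetime(d):
    """DATETIME value corresponding to a field overridden to DATE."""
    return datetime.combine(d, MIDNIGHT)


def time_as_datetime(t):
    """DATETIME value corresponding to a field overridden to TIME."""
    return datetime.combine(BASE_DATE, t)


date_only = date_as_datetime(date(2009, 8, 16))
time_only = time_as_datetime(time(13, 27))
```

A practical consequence is that overriding to DATE only makes sense when the column really stores midnight for every row; otherwise the time portion would be silently ignored.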

Unsupported Data Types

SQL DBMS      Data types
Oracle        All BINARY types including BLOBs; all INTERVAL types; LONG
SQL Server    All BINARY types; XML; limited SMALLDATETIME support
MySQL         All BINARY types; BLOBs


First, a little background…

FileMaker Table Basics

• Every record in a FileMaker table has a unique Record ID (RID)

• Every table has a Master Record List

• List of RIDs in table

• Represented as a sparse bitmap

• Backed by client-side temporary file

• New records get the next available RID

• Deleting records

• Removes the data from the table

• Removes the RID from the Master Record List

Now back to our regularly scheduled programming…

ESS mapping

How does it really work?

• Primary keys are…um…“key”.

• ESS maintains a mapping of PK ↔ RID.

[Diagram: the SQL DB side holds the PKs; the FileMaker DB engine side holds the RIDs]

PKs     RIDs
SKU1    RID45
SKU2    RID12
SKU3    RID39

Master RID list creation

• Every table has a RID list representing every row in the table

• Created when table is first opened

• RID list is allocated based on number of rows in table

select count(*) from ess_table

• RID list creation is very fast

• 1,000,000 row table opened in < 1 second

• No PKs are retrieved.
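The steps on this slide can be sketched as follows. This is an illustration only: SQLite stands in for the ODBC data source, the names `master_rid_list` and `rid_to_pk` are hypothetical, and a plain Python list replaces the sparse bitmap that FileMaker is described as using.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ess_table (key1 INTEGER PRIMARY KEY, d1 TEXT)")
conn.executemany(
    "INSERT INTO ess_table (key1, d1) VALUES (?, ?)",
    [(100000 + i, f"row {i}") for i in range(1, 1001)],
)

# One cheap query gives the row count; no primary keys are fetched.
(row_count,) = conn.execute("SELECT count(*) FROM ess_table").fetchone()

# Master RID list: one RID per row, numbered 1..row_count.
master_rid_list = list(range(1, row_count + 1))

# The RID -> PK mapping starts empty and is filled in later, on demand.
rid_to_pk = {}
```

The point of the design is that opening a table costs a single aggregate query, regardless of how many rows the table holds.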


Mapping Maintenance

• Mapping is maintained passively

• During Refresh Window command (menu or script step)

• When data for a RID cannot be found

• Data fetch (display, calc, sort, etc)

• Join

• Find

• During add and edit operations


Simple Mapping Scenario

Open layout on table


Simple Mapping Scenario (cont.)

• User opens a layout using an ESS table

• Connect to SQL database

• Query number of rows in table (20,000 rows)

select count(*) from rows20000

• Allocate RID list (20,000 RIDs)

• Connection remains open until file is closed

Simple Mapping Scenario (cont.)

• FileMaker Pro requests 25 records for display

• RID request (1-25), mapping is empty

• Fetch primary keys for RIDs 1-25

select key1 from rows20000 order by 1

• Generate mappings for RIDs 1-25 to fetched PKs

RID 1 → 100001

RID 2 → 100002

…

RID 25 → 100025

• Fetch data for RIDs 1-25 (uses key mapping to determine primary keys)

Simple Mapping Scenario (cont.)

• User scrolls to 5,000th record

• Data requested (RIDs 5,000-5,024)

• Fetch primary keys for RIDs 26-5024

select key1 from rows20000 where key1 > 100025 order by 1

• Map key1 values to RIDs

RID 26 → 100026

RID 27 → 100027

…

RID 5024 → 105024

• Fetch data for RIDs 5000-5024

select key1, d1, d2, d3 from rows20000 where key1 in (105000,105001,…,105024)
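The scenario so far can be reproduced in a small sketch. Everything here is an assumption made for illustration: SQLite stands in for the ODBC source, `map_through` and `fetch` are hypothetical helper names, and the `LIMIT` clause replaces whatever mechanism ESS actually uses to stop reading keys once enough have been fetched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rows20000 (key1 INTEGER PRIMARY KEY, d1 TEXT)")
conn.executemany(
    "INSERT INTO rows20000 VALUES (?, ?)",
    [(100000 + i, f"data {i}") for i in range(1, 20001)],
)

rid_to_pk = {}  # hypothetical RID -> primary key mapping, filled lazily


def map_through(last_rid):
    """Fetch primary keys, in key order, until RIDs 1..last_rid are mapped."""
    mapped = len(rid_to_pk)
    if mapped >= last_rid:
        return
    if mapped == 0:
        sql, args = "SELECT key1 FROM rows20000 ORDER BY 1 LIMIT ?", (last_rid,)
    else:
        # Continue after the last key already mapped.
        sql = "SELECT key1 FROM rows20000 WHERE key1 > ? ORDER BY 1 LIMIT ?"
        args = (rid_to_pk[mapped], last_rid - mapped)
    for offset, (pk,) in enumerate(conn.execute(sql, args), start=1):
        rid_to_pk[mapped + offset] = pk


def fetch(first_rid, last_rid):
    """Return the data rows for a contiguous range of RIDs."""
    map_through(last_rid)
    keys = [rid_to_pk[rid] for rid in range(first_rid, last_rid + 1)]
    placeholders = ",".join("?" * len(keys))
    sql = f"SELECT key1, d1 FROM rows20000 WHERE key1 IN ({placeholders}) ORDER BY 1"
    return conn.execute(sql, keys).fetchall()


first_page = fetch(1, 25)       # maps RIDs 1-25, then fetches their data
later_page = fetch(5000, 5024)  # maps RIDs 26-5024 first, then fetches 25 rows
```

Note how jumping to record 5,000 forces 4,999 intervening keys to be fetched even though only 25 rows are displayed.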

Simple Mapping Scenario

RID - PK Mapping

RID      PK
1        100001
2        100002
…        …
5023     105023
5024     105024
5025
5026
…        …
20000

Rows deleted from SQL DB (Scenario Continues…)

Note that we currently have mappings for RIDs 1-5024 and none for RIDs 5025-20000.

• External to FileMaker

• Rows deleted: 100001, 100002

• Rows deleted: 106000-106099

• (total of 102 rows)

• FileMaker user performs Refresh Window

• Refresh Window flushes cached data and causes Master RID list to be updated.

• The mapping remains untouched

Rows deleted from SQL DB (Scenario Continues…)

• How many rows are now in the table?

select count(*) from rows20000

• There are now 19898 rows in the table.

• There are many RIDs in the Master RID list with no mappings, so they are the first to go.

• RIDs 19899 through 20000 are removed from RID list

RID 1 → 100001 Still in mapping!

RID 2 → 100002 Still in mapping!

…

RID 5024 → 105024

RID 5025 through RID 19898 are still unmapped

Rows deleted from SQL DB (Scenario Continues…)

• Mapping

• RID 1 → 100001 Still in mapping!

• RID 2 → 100002 Still in mapping!

• …

• RID 5024 → 105024

• RID 5025 through RID 19898 are still unmapped

• Master RID list:

• 1, 2, 3, …, 19898


Rows deleted from SQL DB (Continued…)

• User scrolls to first record, and refresh cleared the data cache, so… FileMaker now requests data for RIDs 1-25

select key1, d1, d2, d3 from rows20000 where key1 in(100001,100002,…,100025)

• No values returned for 100001 or 100002

• Consult mapping to find RIDs…

• Update Master RID list

3, 4, 5, …, 19898
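The clean-up step on this slide can be sketched in a few lines. The helper name `drop_missing` is hypothetical, and removing the stale entries from the mapping as well as from the Master RID list is an assumption; the slide only states that the Master RID list is updated.

```python
# Hypothetical state after Refresh Window: the row count dropped to 19,898.
master_rid_list = list(range(1, 19899))
rid_to_pk = {rid: 100000 + rid for rid in range(1, 5025)}  # RIDs 1-5024 mapped


def drop_missing(requested_rids, returned_pks):
    """Remove RIDs whose mapped primary key was not returned by the fetch."""
    returned = set(returned_pks)
    missing = [rid for rid in requested_rids if rid_to_pk[rid] not in returned]
    for rid in missing:
        del rid_to_pk[rid]           # assumption: stale mapping is discarded
        master_rid_list.remove(rid)  # the record disappears from the table
    return missing


# The SQL database returned rows for keys 100003..100025 only.
removed = drop_missing(range(1, 26), range(100003, 100026))
```

Deletions are therefore discovered lazily, only when FileMaker asks for data that the SQL database can no longer supply.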


The Bottom Line

• Scrolling requires lots of maintenance

• There’s a great deal of work going on under the covers

• Aren’t you glad you don’t have to worry about this?


Performance Considerations

• Finds

• Joins

• Scrolling / Browsing

• Value Lists

• General


Finds

• Design goal: Mimic FileMaker

• “Find” operations converted to SQL to the extent possible.

• Push as much as possible into the SQL database.

NB: FileMaker post-processes results if necessary.

Finds: Numeric and Date

• Numeric

• Supported operations

• Simple value matches: =, <, >, <=, >=, !

• Range (87..1006)

• Unsupported operations

• Wildcard (e.g. 3##3)

• Date/Time/Timestamp

• Supported operations

• Simple value matches: =, <, >, <=, >=, !

• Range (e.g. 1/1/2000...1/15/2000)

• Wildcard (e.g. */21/2007 10:*:* AM)

Finds: Text

• ESS fields’ case sensitivity depends on database collation for the column

• Usually a default for the database

• Can be overridden on a column-by-column basis in some SQL DBs.

• Most ESS searches are not word-based

• Find “<per” will match “pencils, wooden” but not “wooden pencils”

• “=” and default searches are word-based

• Find “pen” will match “pen, ink” and “ink pen”

• Exact matches are field based (as expected)


Finds: Text

Remember, FileMaker’s default behavior is word search

This is VERY expensive to do in SQL.


Some finds evaluated completely in SQL

• Numeric finds converted to simple predicates

FileMaker    SQL
=32          where col=32
43..97       where col>=43 and col<=97
==Apples     where col='Apples'

• Date, Time and Timestamp converted using SQL literals or functions for date/time values

FileMaker             SQL
=6/1/2007 10:00:00    tscol=datetime'2007-06-01 10:00:00'
1/*/2007              tscol>=datetime'2007-01-01 00:00:00' and tscol<datetime'2007-02-01 00:00:00'
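As a rough illustration of the numeric cases in the table, a translator might look like the sketch below. The function name `numeric_find_to_sql` is hypothetical and the code is not ESS's actual SQL generator; it only reproduces the mapping shown on this slide, using bound parameters instead of inline literals.

```python
def numeric_find_to_sql(column, criterion):
    """Translate a simple numeric find criterion into (predicate, params)."""
    text = criterion.strip()
    if ".." in text:  # range, e.g. "43..97"
        low, _, high = text.partition("..")
        high = high.lstrip(".")  # tolerate the three-dot form "43...97"
        return f"{column} >= ? AND {column} <= ?", [float(low), float(high)]
    for op in ("<=", ">=", "<", ">", "="):  # two-character operators first
        if text.startswith(op):
            return f"{column} {op} ?", [float(text[len(op):])]
    return f"{column} = ?", [float(text)]  # bare value: plain match


equals_find = numeric_find_to_sql("col", "=32")
range_find = numeric_find_to_sql("col", "43..97")
```

Each of these criteria becomes a single predicate that the SQL database can evaluate entirely on its own, ideally with an index.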


Some finds are post-processed

• Text word searches

• FileMaker’s wildcard search is richer than SQL’s

• SQL wildcards are generated, but will generally return more data than will really match

• FileMaker always post-processes these patterns

This can be VERY expensive.

FileMaker                      SQL
=pen                           where col like '%pen%'
=pen*                          where col like '%pen%'
(Any other wildcard search)    SQL to get a constrained superset of rows matching expression
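The two-stage approach can be sketched with SQLite standing in for the SQL source. The table, its contents, and the helper `has_whole_word` are assumptions for illustration; the whole-word rule is a simplified stand-in for FileMaker's own matching logic.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?)",
    [("pen, ink",), ("ink pen",), ("pencils, wooden",), ("can opener",)],
)

# Step 1: the SQL database returns a superset of candidate rows.
candidates = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM products WHERE name LIKE '%pen%' ORDER BY name"
    )
]


# Step 2: post-process locally, keeping rows where some whole word is "pen".
def has_whole_word(text, word):
    return word.lower() in re.findall(r"\w+", text.lower())


matches = [name for name in candidates if has_whole_word(name, "pen")]
```

All four rows travel over the connection, but only two survive the local filter, which is why these finds become expensive on large tables.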

Find Performance

✘ Avoid post-processed finds

✘ Pattern matches on text fields

✘ Word searches on text fields

✘ Non-leading date/time wildcard searches

Time values are always sorted YYYY/MM/DD HH:MM:SS

So…

“12/6/* 10:00:00” performs poorly because year is wild

“*:15:*” performs poorly because hour is wild
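Because timestamps sort from year down to second, a wildcard in a trailing component with everything before it fixed corresponds to one contiguous range. A minimal sketch, assuming a hypothetical helper `month_range` and the column name `tscol` from the earlier slide:

```python
from datetime import datetime


def month_range(month, year):
    """Return the half-open [start, end) range covering one whole month."""
    start = datetime(year, month, 1)
    if month == 12:
        end = datetime(year + 1, 1, 1)
    else:
        end = datetime(year, month + 1, 1)
    return start, end


# The find "1/*/2007" (any day in January 2007) becomes one range predicate.
start, end = month_range(1, 2007)
predicate = (
    f"tscol >= '{start:%Y-%m-%d %H:%M:%S}' "
    f"AND tscol < '{end:%Y-%m-%d %H:%M:%S}'"
)
```

A pattern such as “12/6/* 10:00:00” has no such single range, since matching values are scattered across every year, so it cannot be reduced to one simple range predicate.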

Join Performance

• Single field equi-joins (equals predicate) are preferred

SQL is much more succinct

select pkkey from ess_table where join_column in (v1, v2, …, vn)

instead of

select pkkey from ess_table where (column1 = v1 and column2 = u1) or (column1 = v2 and column2 = u2) or (column1 = v2 and column2 = u3) …
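The difference in query shape can be seen by generating both predicates. The helper names below are hypothetical and the output is only meant to mirror the two statements on this slide, with `?` placeholders in place of the literal values.

```python
def single_column_join(column, values):
    """Build an IN predicate for a single-field equi-join."""
    placeholders = ", ".join("?" for _ in values)
    return f"{column} IN ({placeholders})", list(values)


def multi_column_join(columns, value_tuples):
    """Build the OR-of-ANDs predicate needed for a multi-field equi-join."""
    clause = "(" + " AND ".join(f"{c} = ?" for c in columns) + ")"
    params = [v for values in value_tuples for v in values]
    return " OR ".join(clause for _ in value_tuples), params


single = single_column_join("join_column", [1, 2, 3])
multi = multi_column_join(["column1", "column2"], [(1, "a"), (2, "b")])
```

The multi-column predicate grows by one parenthesized clause per related value pair, so both the statement text and the work required of the SQL optimizer increase with the size of the join.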

Join Performance

• Try to avoid large joins

Filter/find first and then join

• Make sure ESS column(s) have appropriate SQL indexes

NB: Index should cover at least all joined columns

Browsing Performance

• Table-based browsing should be avoided

• Large table performance can be expensive

• All intervening key values must be fetched

• Mappings must be updated and maintained

• Order of rows is not necessarily predictable

• Rows from FileMaker-based tables generally appear in the order in which they were inserted

• ESS-based rows appear in the order in which their keys were fetched

• Browse on found sets instead

• Hide the status area


Value Lists

• Version 10 and later support value lists on ESS fields

• All functionality is available (no technical restrictions)

• For best performance avoid having ESS table as the secondary field unless it is the same table as the primary field.

• Performance of single field value lists generally not a problem.

General Performance

• FileMaker does all sorting of data

• Fetches all records in found set

• Avoid sorting large quantities of data

• FileMaker does all summaries (count, sum, average, etc.)

• Fetches all records in found set

• Avoid summarizing large quantities of data

• Consider creating SQL views with aggregations

• Delete shadow fields that are never used from shadow table

• Caveat: Keep fields that tell you a record has been modified.
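The suggestion to create SQL views with aggregations can be sketched as follows. SQLite stands in for the SQL database, and the `orders` table, its columns, and the view name `customer_totals` are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(1, 10.0), (1, 15.0), (2, 7.5)],
)

# The database computes the aggregates; a client reading the view
# only ever receives one row per customer.
conn.execute(
    """
    CREATE VIEW customer_totals AS
    SELECT customer_id, count(*) AS order_count, sum(amount) AS total_amount
    FROM orders
    GROUP BY customer_id
    """
)

totals = conn.execute(
    "SELECT * FROM customer_totals ORDER BY customer_id"
).fetchall()
```

In this sketch `customer_id` is unique within the view, so it could presumably serve as the key column if the view were added as a shadow table.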

Best Practices

• Don’t allow browsing of large tables.

• Find target set and then browse.

• Hide status area.

• Don’t sort or summarize large numbers of rows.

• Use SQL database’s native tools for loading and exporting large amounts of data.

• Use command-line tools from FM scripts where possible

• Or use Execute SQL script step

• Try to optimize “find” operations to avoid expensive post-processing.

Best Practices

• Good relational database design is crucial to good performance

• Normalization

• Surrogate keys

• Good indexes

• Discrete primary key types (not floating point, or large text fields)


Questions?


Thank You!