Seminar 04A - Care and Feeding of the Institutional
Directory Service
Advanced Issues, Problems, and Solutions
Presented by Brendan Bellina and Rob Banz
October 9, 2006
Overview
• Speaker Introductions
• Overview of Enterprise Directory Models and implemented systems at USC and UMBC
• Data Transport
• Directory Schema Design
– Directory Information Tree (dn format, depth)
– People
– Accounts
– Groups
– Permissions
– Standard Object Classes
– Schema extensions (Get your OID on!)
Overview (cont.)
• Controlling Access
• Monitoring Performance
• Directory Administration Tools
• Directory Replication and Synchronization
• Authentication Services
• Authorization Services
• Managing Attribute Release
– Service Accounts
– Shibboleth
– Provisioning
– Federating
Overview (cont.)
• Directory Team Staffing
• Additional Issues to Consider
• Future Advancements
• Institutional Policies
• Inter-institutional Collaborative Resources
• Questions
Brendan Bellina
Identity Services Architect, USC
• Background in Financial Software Development and Data Warehouse Design
• Active in Higher-Education Identity Management / Directory Services since 2001
• Designed and implemented the Enterprise Directory Service at the University of Notre Dame (2001-2004) http://eds.nd.edu
• Architect of USC Global Directory Service (2005-current) http://www.usc.edu/gds
• Presentations and online materials available at http://its.usc.edu/~bbellina
Rob Banz, UMBC
• Managing the Core Systems group at UMBC.
• Background in UNIX systems engineering and software architecture.
• Likes making things work together that aren’t supposed to…
• Architect of UMBC’s Enterprise Directory / IDMS (2000 - present)
• Presentations available at http://umbc.edu/~banz
Data Collection: Multiple internal Systems of Record
Data Migration: Metadirectory scripts
Phased Implementation: Organizational Units (People, Accounts, Groups, Courses, etc.)
Data Restrictions: LDAP Access Controls, Shibboleth ARP’s
Data Access: Designed for high-volume read, low-volume write.
Applications, End-users, Application/NOS directories
Enterprise Directory Architectures
• Centralized EDS
– Everything queries the central EDS
– Central control
– Performance bottleneck risk
• Replicated EDS
– Replicate servers for performance
– Data Latency
• Derivative directories
– Distribute EDS data to stand-alone directories
– Issues managing identities
– Risk of data leakage and inconsistent access controls
• Isolated directories
– Isolated user stores (ugh!)
Initial Implementation Plan
• Production Hardware
– Redundancy
– Security
– Scalability
– Monitoring
– Availability
– Performance
– Recoverability
– etc.
• Integrated Test/Development System(s)
– Pre and Post Production Systems
– Crash and Burn system
UMBC Server Architecture
• Design with DR in mind!
– Mirrored storage across datacenters for important transactional data (registry, master directory, etc.)
– Easy to bring up on similar hardware when the time comes without losing changes
• Replicas
– N+1. Be sure you can handle all of your transactions if one is missing.
– Hardware is cheap. Memory is cheap. Overbuild now, and stay ahead of the curve.
– Where’s the curve you ask? We’ll get to that.
UMBC Directory Architecture
• Future growth and projects
– PeopleSoft Student Administration
– Expand physical access control integration
– Partial replication to OID (Oracle’s Directory)
– Logging & Diagnostics
• …and others that I can’t imagine yet.
Batch Pros and Cons
• Periodic processes are easier to support
• Periodic processes are easier to update
• Batch processes allow looser integration testing
• Data Latency
• Performance spikes
• Effective delay of service
Real-Time Pros and Cons
• Shorter delays in processing
• Transactions are spread-out (generally) allowing smaller systems
• Like spam, it never stops
• Harder to test, support, and maintain
Data Extract Issues
• Codes in tables or Values in entries
– Transaction systems often use codes
– End services often require values. Standard LDAP attributes are expected to be values.
– Single changes in code tables may result in many updates to values in entries
– Values in entries alone may not provide enough information for data selection
• Should directory be insulated from source system table structures?
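The code-versus-value tradeoff can be sketched in a few lines. This is a minimal illustration, not either institution's metadirectory code: the code table, source rows, and function names are all hypothetical. It shows how a single change in a code table fans out into updates on every entry that carries the resolved value.

```python
# Hypothetical code table, as a transaction system might hold it.
DEPT_CODES = {"PHYS": "Physics", "CHEM": "Chemistry"}

# Hypothetical source rows: the system of record stores only the code.
SOURCE_ROWS = [
    {"id": "a1", "dept_code": "PHYS"},
    {"id": "b2", "dept_code": "PHYS"},
    {"id": "c3", "dept_code": "CHEM"},
]


def build_entries(rows, code_table):
    """Resolve codes to values, since directory clients expect values."""
    return {row["id"]: {"ou": code_table[row["dept_code"]]} for row in rows}


def entries_to_update(rows, old_table, new_table):
    """Return the ids whose directory value changes when the code table changes."""
    before = build_entries(rows, old_table)
    after = build_entries(rows, new_table)
    return sorted(i for i in before if before[i] != after[i])


# One row changes in the code table...
renamed = dict(DEPT_CODES, PHYS="Physics and Astronomy")
# ...and every entry holding the old value must be rewritten.
changed = entries_to_update(SOURCE_ROWS, DEPT_CODES, renamed)
print(changed)
```

Keeping the resolution step in one function is one way to insulate the directory from source table structures: only `build_entries` needs to know how the source encodes a department.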
USC Data Transport
• Batch components
– Employee system updates
– Sync between Account System and GDS
– Sync between Person Registry and GDS
– Rebuilding GDS groups and permissions
• Real-time components
– Identity creation in Person Registry from Student System
– Identity creation in Person Registry from Employee System
– Identity creation from Guest/Affiliate “iVIP” system
– Creation of Sponsored User Accounts “SASU”
UMBC Data Transport
• Real Time Components
– Student Status / Enrollment Changes
– ID Card issuance (Mag Strip / Library Card #)
– Self / Administrative initiated changes
    • Identity creation for new faculty/staff *
    • Identity creation for affiliates (guests, etc.)
    • Account creation / activation
    • Directory information updates
– Back-feeds of CampusID data
• Batch Components
– CMS (Blackboard) Course Creation
– Faculty / Staff Identity Updates
– Data feeds to Library
Directory Design Decisions To Be Made
• DIT – Tall or Flat
• Lots of attributes (“thick”) or only identifiers (“thin”)
• dn and rdn format
• Direct or proxied update access
• DS mastered content - entries & attributes
• LDAP as password store
• Duration / Permanence of directory entries and identifiers
• People vs. Accounts
• Groups (subgroups, roles, dynamic groups, static groups, managed groups, exceptions, personal groups, etc.)
DIT Architecture
Tall & Spiky:
ou=Academic
– ou=Sciences (ou=Physics, ou=Chemistry)
– ou=Arts & Letters (ou=Philosophy)
Flat:
ou=People, ou=Groups
Why not Tall & Spiky?
• Not amenable to people being in multiple organizational units simultaneously
• Not efficient when people move between organizational units frequently
• Not efficient when organizational hierarchy changes occur
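The points above can be sketched as follows. This is an illustration only: the base dn, the uid, and the function names are made up for the example. In a tall tree the organizational path is part of the dn, so a transfer is a rename; in a flat tree the dn is stable and the organizational units become ordinary multi-valued attributes.

```python
BASE = "dc=example,dc=edu"  # hypothetical base dn for illustration


def tall_dn(uid, org_path):
    """dn in a tall DIT: the organizational path (top-down) is part of the name."""
    ous = ",".join(f"ou={ou}" for ou in reversed(org_path))
    return f"uid={uid},{ous},{BASE}"


def flat_dn(uid):
    """dn in a flat DIT: every person sits directly under ou=People."""
    return f"uid={uid},ou=People,{BASE}"


def flat_entry(uid, org_units):
    """In the flat model, organizational units are a multi-valued attribute."""
    return {"dn": flat_dn(uid), "ou": list(org_units)}


# A move from Physics to Chemistry changes the tall dn (a rename operation)...
before = tall_dn("bmoose", ["Academic", "Sciences", "Physics"])
after = tall_dn("bmoose", ["Academic", "Sciences", "Chemistry"])

# ...while the flat entry keeps its dn and can list both units at once.
joint = flat_entry("bmoose", ["Physics", "Chemistry"])
print(before)
print(after)
print(joint)
```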
Distinguished Name (dn) format
• Issues
– Useful for LDAP enabled apps
– Visible if any attribute in the entry is visible
– Must be unique within scope
– Benefits in being persistent, non-reassignable, and opaque
• Standards
– X.500 naming (based on geographical location)
    • cn=Bullwinkle Moose, ou=people, o=Wossamotta U, st=Confusion, c=US
– Domain Component naming (most commonly used)
    • cn=Bullwinkle Moose, ou=people, dc=Wossamotta, dc=edu
• Relative Distinguished Name selection
– uid, cn, directory id, or something else?
USC Decisions
• dn: dc naming using unique directory id as rdn
• Flat DIT. Thick entries.
• Central authN/authZ “where possible”
• Single system for identities - Person Registry
• Registry is “Cradle to Grave” or “Womb to Tomb” eventually
• Require use of service dn’s for LDAP-enabled applications
• Passwords in Kerberos rather than LDAP where possible
• Allow multiple accounts per person, but move to establish “NetID” for enterprise services
• Use of post-business-rule data source “signatures”
• Directory contains people who receive or have received electronic services
• Neither Registry nor Directory provide reporting services
• Groups for authorization, with group memberships and authorizations reflected in member entries
USC Decisions (cont.)
• People entries (ou=people)
– An entry is created for each identity in the Person Registry that requires electronic services. Entries may be deactivated when service ends, but never deleted.
– People entries may be publicly accessible via LDAP protocol to allow use with email clients.
– People entries have no credentials or login capability.
– Example: uscrdn=usc.edu.scbs5rm6,ou=people,dc=usc,dc=edu
– http://eds.nd.edu/cgi-bin/nd_ldap_search.pl?ldapurl=gds-ldap.usc.edu:389/uscrdn%3Dusc.edu.scbs5rm6%2Cou%3Dpeople%2Cdc%3Dusc%2Cdc%3Dedu&ldapheadattr=displayname&displayformat=generic
USC Decisions (cont.)
• Account entries (ou=accounts)
– An entry is created for each active enterprise Unix account. These are intended to be used only by Unix services. Entries may be deactivated when service ends, but never deleted.
– An “aggregate” account is created based on username for each set of Unix accounts a person has. Usually a person has a single aggregate account. This is intended to be used by Shibboleth and LDAP-enabled services.
– A “privilege” account is created for non-Unix services, is attached to a sponsor’s person entry, and is restricted to a single application. This can accommodate LDAP-enabled applications that use reserved account names - like “sa” or “admin” - or provide limited access to services for non-people (like vendors).
– No account entries are visible publicly. They are visible to the owner.
– LDAP-enabled apps that construct dn CANNOT WORK
– Example: uscrdn=usc.edu.scdv5wtq6,ou=accounts,dc=usc,dc=edu
USC Decisions (cont.)
• Group entries (ou=groups)
– Static groups are used rather than dynamic groups. Members of groups can be person or account entries, but not other groups.
– Groups may be rule-based. The rule is defined as an LDAP filter. Rule-based groups are reconstructed each business day.
– Groups may have any number of inclusion or exclusion groups that are applied to their membership. Inclusion and exclusion groups are manually administered. Groups that have dependencies on inclusion or exclusion groups are reconstructed each business day.
– Authorizations are controlled via groups. Shibboleth entitlements, eligibilities, and membership of a group are maintained in member attributes to facilitate use by Shibboleth, in group-math-based LDAP filters, and in directory access controls.
– No group entries are currently visible publicly, although it is possible for a group to be defined as public.
– Example: uscrdn=usc.edu.scmb9tg2,ou=groups,dc=usc,dc=edu
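The daily reconstruction of a rule-based group with inclusion and exclusion groups can be sketched as set arithmetic. This is a minimal sketch, not USC's implementation: it assumes exclusions win over inclusions, uses a simple attribute/value match as a stand-in for a real LDAP filter, and all function names and sample dn's are hypothetical.

```python
def rule_matches(entry, rule):
    """Stand-in for an LDAP filter: every attribute/value pair in `rule` must be present."""
    return all(value in entry.get(attr, []) for attr, value in rule.items())


def rebuild_group(entries, rule, include_groups, exclude_groups):
    """Recompute static membership as (rule matches + inclusions) - exclusions."""
    members = {dn for dn, entry in entries.items() if rule_matches(entry, rule)}
    for group in include_groups:
        members |= set(group)
    for group in exclude_groups:  # assumption: exclusion has the last word
        members -= set(group)
    return sorted(members)


# Hypothetical people entries keyed by dn.
P1 = "uscrdn=example1,ou=people,dc=usc,dc=edu"
P2 = "uscrdn=example2,ou=people,dc=usc,dc=edu"
P3 = "uscrdn=example3,ou=people,dc=usc,dc=edu"
entries = {
    P1: {"affiliation": ["student"]},
    P2: {"affiliation": ["staff"]},
    P3: {"affiliation": ["student"]},
}

members = rebuild_group(
    entries,
    rule={"affiliation": "student"},
    include_groups=[[P2]],   # manually administered inclusion group
    exclude_groups=[[P3]],   # manually administered exclusion group
)
print(members)
```

Because the result is written back as a static member list, consumers never need to evaluate the rule themselves; they only read the stored membership.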
UMBC Decisions
• dc= naming (our public directory has it…)
• Registry has a long history (back to 1980 for students!)
• Passwords in Kerberos, but synchronized to LDAP for other uses.
• Group membership *or* attribute definitions may determine authorization. (e.g. affiliation=student makes you eligible for certain services such as a computer “account”)
• No “self modify” of entries
UMBC Decisions
• ou=People
– You can “bind” (authenticate) as a person
– Most applications are using a “person’s” rights for authorization data (shibboleth, etc)
– dn’s are opaque: (guid=6cbfa31e-6e14-11d4-9669-8020cd7816,ou=people,…)
• ou=Accounts
– You can “bind” as an account too!
– dn’s aren’t opaque :( (uid=banz,ou=accounts,…)
UMBC Decisions
• Groups
– Appear in a few places in the DIT
• ou=Courses
– Trees of groups for each semester containing course enrollment. Used for lab access control, Blackboard course population, dynamic email lists, etc.
• ou=Applications
– Application-specific group trees
• ou=Departments
– Group trees for specific university departments
• ou=Radius
– Groups used by our radius servers for VPN access
UMBC Decisions
• Two kinds of groups…
– Standard (groupofuniquenames)
    • Used by external applications
– Extended (umbcgroupofuniquenames)
    • Used by internal applications
    • Can contain nested groups (internal applications know how to grok them)
• Future?
– These should/will both be replaced with groups generated from
Standard Object Classes
Used at USC:
• top
• person
• organizationalPerson
• inetOrgPerson
• eduPerson
• posixAccount
• groupOfUniqueNames
Schema Extensions
• Step One: Get an OID assignment for your institution from IANA
• Step Two: Create new objectclasses for new attributes
• DO NOT make up or reuse an OID
• DO NOT modify a standard objectclass
• DO NOT populate standard attributes in non-standard ways
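A small helper can enforce the first two rules mechanically. The sketch below is an assumption-laden illustration rather than a real schema tool: it takes the enterprise arc 1.3.6.1.4.1.13363 that appears in the USC object class definitions, refuses any OID outside that arc, and formats a new auxiliary object class instead of touching a standard one. The function names, the `.3.2.99` OID, and the `exampleAux` class are hypothetical.

```python
ENTERPRISE_ARC = "1.3.6.1.4.1.13363"  # arc seen in the uscDirectoryEntry OID


def is_under_arc(oid, arc=ENTERPRISE_ARC):
    """True if `oid` is a numeric OID strictly below the institution's arc."""
    parts = oid.split(".")
    return oid.startswith(arc + ".") and all(p.isdigit() for p in parts)


def auxiliary_objectclass(oid, name, desc, must=(), may=()):
    """Format an auxiliary objectclass definition; reject OIDs outside the arc."""
    if not is_under_arc(oid):
        raise ValueError(f"{oid} is not under {ENTERPRISE_ARC}")
    clauses = [f"NAME '{name}'", f"DESC '{desc}'", "SUP top AUXILIARY"]
    if must:
        clauses.append("MUST ( " + " $ ".join(must) + " )")
    if may:
        clauses.append("MAY ( " + " $ ".join(may) + " )")
    return f"objectclasses: ( {oid} " + " ".join(clauses) + " )"


# Hypothetical new class holding a new attribute, leaving inetOrgPerson untouched.
definition = auxiliary_objectclass(
    "1.3.6.1.4.1.13363.3.2.99", "exampleAux", "Example", must=("uscGuid",)
)
print(definition)
```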
USC Schema Extensions
• For all directory entries:
– uscDirectoryEntry objectclass
• For people entries:
– uscEduPerson objectclass
– uscMailRecipient objectclass
• For account entries
– uscEduPerson objectclass
– uscAccount objectclass
• For group entries
– uscGroupEntry objectclass
uscDirectoryEntry
objectclasses: ( 1.3.6.1.4.1.13363.3.2.1
NAME 'uscDirectoryEntry'
DESC 'USC Directory Entry Object Class'
SUP top AUXILIARY
MUST (
uscGuid $
uscRDN $
uscPvid $
createTimestamp
)
MAY (
uscEntryNote $
uscEntryStatus $
uscEntryExpirationDate $
uscEntrySource $
uscEntryUsage $
uscEntryCategory $
uscEntryCreateDate $
uscEntryDeactivationDate $
uscEntryReleasePolicy $
uscAttributeReleasePolicy $
uscAuthEligible $
uscAuthEligibleDN $
uscEntrySignature $
uscHistoricalPvid $
uscOwnerPvid $
creatorsName $
modifyTimestamp $
modifiersName $
searchguide $
labeledURI $
owner $
description $
userPassword
)
X-ORIGIN 'user defined'
)
uscGroupEntry
objectclasses: ( 1.3.6.1.4.1.13363.3.2.5
NAME 'uscGroupEntry'
DESC 'USC uscGroupEntry Object Class'
SUP groupOfUniqueNames STRUCTURAL
MUST (
uscGroupType
)
MAY (
owner $
uscGroupMember $
uscGroupRule $
uscGroupRuleComponent $
uscGroupIncludeDN $
uscGroupExcludeDN $
uscGroupOptInDN $
uscGroupOptOutDN $
uscGroupSelfOptOut $
uscGroupEnrollmentType $
uscGroupCategory $
uscGroupLevel $
uscGroupOwner $
uscGroupOwnerProxy $
uscGroupManager $
uscGroupSponsor $
uscGroupMemberAuthEligible $
uscGroupMemberAuthEligibleDN $
uscGroupMembershipListVisibleToMembers $
uscGroupKeyword $
uscGroupIsNestable $
uscGroupUniqueMemberSignature $
uscGroupMembershipAttributeControl $
uscGroupExcludeOverrideDN $
uscGroupMemberEntitlement
)
X-ORIGIN 'user defined'
)
Directory Access
• Direct access via LDAP/LDAPS
– Directory Access Control Lists / Instructions
    • Netscape / iPlanet / Sun uses ACI’s
# Allow all access to the Directory Administrators Group
aci: (targetattr ="*") (version 3.0;acl "Directory Administrators Group"; allow (all) (groupdn = "ldap:///cn=Directory Administrators, dc=usc,dc=edu") ; )
#
Access to an entry is based on attributes of the entry. Group membership is not an attribute unless you create one like isMemberOf and populate it.
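Populating a membership attribute on each member amounts to inverting the group-to-member mapping. The sketch below shows one way to compute the values, assuming groups are available as a mapping from group dn to member dn's; the function name and the sample dn's are hypothetical, and writing the result back to the directory is left out.

```python
from collections import defaultdict


def compute_is_member_of(groups):
    """Invert group -> members into member -> sorted list of group dn's."""
    memberships = defaultdict(set)
    for group_dn, member_dns in groups.items():
        for member_dn in member_dns:
            memberships[member_dn].add(group_dn)
    return {dn: sorted(group_dns) for dn, group_dns in memberships.items()}


# Hypothetical groups keyed by dn, each listing its members' dn's.
ADMINS = "cn=Directory Administrators,dc=example,dc=edu"
STAFF = "cn=Staff,ou=groups,dc=example,dc=edu"
ALICE = "uid=alice,ou=people,dc=example,dc=edu"
BOB = "uid=bob,ou=people,dc=example,dc=edu"

is_member_of = compute_is_member_of({ADMINS: [ALICE], STAFF: [ALICE, BOB]})
print(is_member_of[ALICE])
```

Once each entry carries these values, an access rule or search filter can test membership by looking only at the entry itself.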
Directory Access
• Proxied access– Shibboleth ARP’s
<AttributeReleasePolicy>
<Rule>
<Target>
<Requester>ServiceProvider</Requester>
</Target>
<Attribute name="urn:attributeURN">
<Value release="permit|deny" matchFunction="urn:functionURN">attributeValue</Value>
</Attribute>
</Rule>
</AttributeReleasePolicy>
Directory Access
• Shibboleth Rule Constraint (USC authored patch for Shibboleth 1.3; included in 2.0)
<Constraint attributeName="urn:attributeURN"
matchFunction="urn:functionURN"
matches="any|all|none">value</Constraint>
This allows Shibboleth to restrict attribute release to a Service Provider based on attributes of the entry. This mimics the capabilities of Directory ACI’s.
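The any/all/none semantics can be illustrated outside Shibboleth. This is a sketch of one plausible reading, not the patch's actual code: it assumes "any" means at least one value of the attribute matches, "all" means every value matches (and at least one exists), and "none" means no value matches. The function name and the default equality match function are hypothetical.

```python
def constraint_satisfied(values, expected, matches="any", match=lambda a, b: a == b):
    """Evaluate a rule constraint against the values of one attribute of the entry."""
    hits = [match(value, expected) for value in values]
    if matches == "any":
        return any(hits)
    if matches == "all":
        # Assumption: an attribute with no values does not satisfy "all".
        return bool(hits) and all(hits)
    if matches == "none":
        return not any(hits)
    raise ValueError(f"unknown matches mode: {matches}")


# Hypothetical entry with two affiliation values.
affiliations = ["student", "member"]
release_if_student = constraint_satisfied(affiliations, "student", "any")
release_if_only_student = constraint_satisfied(affiliations, "student", "all")
release_if_not_student = constraint_satisfied(affiliations, "student", "none")
print(release_if_student, release_if_only_student, release_if_not_student)
```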
Why Monitor Performance?
• Availability: It’s up… isn’t it?
Directories often require 7x24 availability
• Responsiveness: It’s fast enough… maybe.
Directories often require extremely fast response and can be affected by unplanned usage through public interfaces
• Scalability: We can handle that… I think.
Structural changes, indices, and increases in the # of entries, # of attributes, or # of queries may affect performance.
Metrics to Monitor
Response time
Connection requests
Bind requests
Bind errors
Search requests
Search errors
Avg count & size of search results
Directory Cache Hits
Directory Cache Tries
Bind response time
Search response time
Current connections
Avg connection length
Current binds
Current searches
# Bytes transmitted
# Entries transmitted
Methods to Monitor
• LDAP query
• LDAP log monitoring
• Directory Probing
• Existing Free Utilities
– Orca - open source tool for monitoring OS
– Look - LDAP operational Orca "k"ollector
– Cacti - open source tool for monitoring OS via SNMP
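LDAP log monitoring can be sketched as a small counter over access-log lines. The log format below is an assumption modeled on Netscape/Sun-style access logs (a request line followed by a RESULT line sharing the same conn and op numbers); real formats vary by product and version, and the sample lines and function name are made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape: "... conn=N op=N BIND|SRCH|RESULT ..."
LINE = re.compile(r"conn=(\d+) op=(\d+) (BIND|SRCH|RESULT)\b(.*)")


def summarize(lines):
    """Count bind/search requests and their errors from access-log lines."""
    counts = Counter()
    pending = {}  # (conn, op) -> request type awaiting its RESULT
    for line in lines:
        found = LINE.search(line)
        if not found:
            continue
        conn, op, kind, rest = found.groups()
        if kind in ("BIND", "SRCH"):
            pending[(conn, op)] = kind
            counts[kind + "_requests"] += 1
        else:
            started = pending.pop((conn, op), None)
            err = re.search(r"err=(\d+)", rest)
            if started and err and err.group(1) != "0":
                counts[started + "_errors"] += 1
    return counts


# Hypothetical sample lines.
SAMPLE = [
    '[09/Oct/2006:10:00:01 -0700] conn=11 op=0 BIND dn="uid=app,ou=accounts,dc=example,dc=edu" method=128 version=3',
    '[09/Oct/2006:10:00:01 -0700] conn=11 op=0 RESULT err=0 tag=97 nentries=0 etime=0',
    '[09/Oct/2006:10:00:02 -0700] conn=11 op=1 SRCH base="ou=people,dc=example,dc=edu" scope=2 filter="(uid=bmoose)"',
    '[09/Oct/2006:10:00:02 -0700] conn=11 op=1 RESULT err=0 tag=101 nentries=1 etime=0',
    '[09/Oct/2006:10:00:03 -0700] conn=12 op=0 BIND dn="uid=bad,ou=accounts,dc=example,dc=edu" method=128 version=3',
    '[09/Oct/2006:10:00:03 -0700] conn=12 op=0 RESULT err=49 tag=97 nentries=0 etime=0',
]

stats = summarize(SAMPLE)
print(dict(stats))
```

Counters like these map directly onto the bind request, bind error, search request, and search error metrics listed earlier, and could be fed to a grapher on a schedule.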
Directory Administration Tools
• USC
– ACI’s and schema managed via LDIF
– Perl admin scripts for querying, adding attributes to entries, replacing attributes in entries, modifying group membership
– Plans to write a web utility for group creation and maintenance
– Plans to write a web utility to allow departments to manage group memberships
Directory Administration Tools
• UMBC
– Sun One Directory Server Console (it’s almost usable again…)
– Custom web applications for administration and self-service
– Perl, perl, perl. (and a little bit of java)
– Exploring Grouper & Signet for advanced group and role management
Replication / Synchronization
• Why?
– Redundancy
– Capacity
– Isolation of heavy directory consumers (e.g. mail servers query their own replicas)
• What?
– The whole directory, or just some…
Replication / Synchronization
• Whole directory replicas…
– “built in” to most modern directory servers
    • OpenLDAP, Sun JDS, Active Directory, etc.
– Replicas are typically “read only”
– Some support “multi-master” replication
    • Some do it well (Active Directory), some not so well…
Replication / Synchronization
• Partial / Filtered Replication… Why?
– An application may only need one subtree of the DIT (e.g. mail routing)
– A “white pages” directory with restricted information (outside of a firewall)
– An application may need to have information represented in a “special” way (boo!)
– An application may only work against “brand y” directory and you have “brand x” (tsk tsk!)
Replication / Synchronization
• How?
– Some products have the ability to filter attributes and/or subtrees.
– Want to do something more complicated?
    • Sun JDS has a query-able “changelog” that can advise you when a directory object has been modified, to trigger a synchronization operation
    • UMBC does its real-time external feeds by monitoring for interesting changelog events.
    • Write code to do anything from simply comparing/updating an object on a remote directory to transforming attributes, groups, etc…
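The compare-and-update step can be sketched as a diff between the source entry and its copy on the remote directory. This is a minimal illustration that treats entries as plain dictionaries of attribute to value list; the function name, the transform table, and the sample data are hypothetical, and reading the changelog or applying the changes over LDAP is out of scope.

```python
def plan_sync(source, replica, transforms=None):
    """Return {attr: (operation, values)} needed to make `replica` match `source`.

    `transforms` maps a source attribute to (target attribute, value function).
    """
    transforms = transforms or {}
    desired = {}
    for attr, values in source.items():
        target_attr, convert = transforms.get(attr, (attr, lambda v: v))
        desired[target_attr] = sorted(convert(v) for v in values)

    changes = {}
    for attr, values in desired.items():
        if sorted(replica.get(attr, [])) != values:
            changes[attr] = ("replace", values)
    for attr in replica:
        if attr not in desired:
            changes[attr] = ("delete", [])
    return changes


# Hypothetical entry as seen on the master and on a remote directory.
source = {"mail": ["BMoose@Example.edu"], "cn": ["Bullwinkle Moose"]}
replica = {"mail": ["old@example.edu"], "cn": ["Bullwinkle Moose"], "pager": ["555"]}

# The remote side wants lower-case mail addresses.
changes = plan_sync(source, replica, {"mail": ("mail", str.lower)})
print(changes)
```

Computing a plan before applying it keeps the operation idempotent: running it again after a successful sync yields an empty plan.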
Internal Authentication
• userPassword attribute
• Password can be encrypted using several encryption algorithms, although required compatibility with services may limit the choices
• LDAP bind operation
• LDAP compare operation
External Authentication
• Passing authentication to Kerberos
– Free directory plug-ins for iPlanet/Sun available
    • University of Notre Dame
    • Duke
    • USC (soon!)
USC Directory Enforced AuthZ
LDAP-enabled applications use ACI’s to constrain the application service account so that only authorized user entries are accessible
– Group defines the authorized users of the application. Each member entry has eligibility attributes set - uscAuthEligible, uscAuthEligibleDN.
– Application is assigned a service account that is constrained by an ACI to see only the entries with the uscAuthEligibleDN value for the application.
– Uses wildcards to allow a single ACI to handle most constrained application service accounts.
– Use an ACI to prevent constrained apps from seeing publicly visible entries
Attribute Release
• Consider impact of FERPA and HIPAA. Make sure that applications are not passing data on to other applications or displaying protected data inappropriately.
• Keep track of what is released to whom so that impacts are known when directory changes must be made
• Make it easy on yourself: default == deny.
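Default-deny release and tracking of who receives what can both fall out of a single policy table. The sketch below is illustrative only: the policy structure, application names, and function names are hypothetical, and a real deployment would express this in ACI's or ARP's rather than application code.

```python
# Hypothetical policy: each requester lists the attributes it may receive.
RELEASE_POLICY = {
    "hypothetical-library-app": {"eduPersonAffiliation", "displayName"},
    "hypothetical-parking-app": {"displayName"},
}


def release(requester, entry, policy=RELEASE_POLICY):
    """Return only explicitly permitted attributes; unknown requesters get nothing."""
    allowed = policy.get(requester, set())
    return {attr: values for attr, values in entry.items() if attr in allowed}


def consumers_of(attribute, policy=RELEASE_POLICY):
    """List requesters that receive `attribute`, for impact analysis before a change."""
    return sorted(app for app, attrs in policy.items() if attribute in attrs)


# Hypothetical entry with one protected attribute.
entry = {
    "displayName": ["Bullwinkle Moose"],
    "eduPersonAffiliation": ["student"],
    "homePostalAddress": ["Frostbite Falls"],
}

print(release("hypothetical-library-app", entry))
print(release("unknown-app", entry))
print(consumers_of("displayName"))
```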
Attribute Release Mechanisms
Releasing attributes via LDAP service accounts
– Use service accounts and ACI’s to limit attribute release to those applications that require it.
– Can be used to retrieve attributes about any visible entries.
– Mapping to groups via LDAP may be used to reduce the need to propagate group information to applications.
Releasing attributes via Shibboleth
– Attributes for the user logging in are released. Shibboleth is normally not used to retrieve attributes about other users, groups, or other directory objects.
Attribute Release Mechanisms
Provisioning
– Directory log watcher used to provision to an external directory in real time
– Using signature attributes to facilitate provisioning of static groups
Federation (via Shibboleth)
– Releasing local attributes to remote Services
– Releasing local attributes about remote guests
USC Directory Technical Team
• 1 FTE Identity Services/Directory Architect
• 1 FTE Sr Developer focused on Registry
• 1 FTE Technical Analyst focused on Application Integration and Shibboleth
• 1 FTE Sr Application Developer
• 1 FTE Developer
• MIA - 1 FTE Sr Developer for Web Services
• MIA - 1 FTE Sr Developer for Messaging
• MIA - Positions Missing in Action due to funding cuts
• Server and Directory operations and support are managed by resources in another department.
UMBC Directory Team
• 1 FTE IDMS / Integration Developer
• Partial FTE IDMS Architect
• … other intrepid souls in our department and others picking up tasks such as:
    • Identity Resolution issues
    • Managing identities of campus affiliates
Additional Issues to Consider
• Not a “safe” career path
– Saying “no” in higher-ed is unhealthy. Even saying “no without data steward approval” is unhealthy when in central IT.
• Compatibility with all applications is not achievable
– dn syntax
– Use of service accounts
– Use of Shibboleth
• Application Administrators are always a problem
– Special accounts
– Special privileges
– Poorly managed
Future Advancements
• NMI Grouper
– Groups Registry
• NMI Signet
– Privileges Registry
• NMI Shibboleth 2.0
– SAML 2.0 compliant web SSO product that supports a federated model and privacy protection
Institutional Policies
• Release of data should require data steward approval
– Risk: They’ll stop giving you data
• Registry should not be used for reporting or end-user access
– Risk: Access Controls between Registry and Directory may be impossible to sync, so you may release data inappropriately. Performance may suffer.
• Access Controls should be in the directory
– Risk: Applications will use whatever data they can get. Honey pots of identity information will exist on department servers that are likely to be poorly managed and secured.
• Applications should not pass data to each other
– Risk: Understanding of what apps are using what data will be lost. Data stewards will lose trust and stop providing data. Cripples ability to make changes in the directory which could lead to being non-standard.
Institutional Policies
• Authorization should be required
– Risk: Authentication alone forces applications to do authorization. This will cause problems when you expand the population of the directory. It also makes it impossible for an audit to determine who has what authorizations. It also requires the continued bad practice of data stewards shipping data to end-user apps.
• Know who is using what attributes
– Risk: Directory changes are inevitable. If you do not know who is using what, you will be unable to assess the impact of a change before making it.
• Follow standards wherever possible
– Risk: Following standards is the safest way to ensure compatibility with the widest possible array of applications and services. If an application cannot use your enterprise directory, its owners will build their own.
Institutional Policies
• Applications should only be given persistent identifiers that are never reissued
– Risk: Applications may have different purge practices. The reuse of identifiers risks a person getting inappropriate access to another person’s records.
• Anonymous access should not allow access to FERPA or otherwise private information
– Risk: Privacy needs to be protected. In addition if a service tries to use the public interface to the directory without approval for their application it will not work for FERPA and private people, which will eventually force them to seek appropriate approval.
• Do not delete entries from the Registry or Directory
– Risk: Identifiers may be mistakenly reissued. A person returning may not be recognized and given new identifiers. This means though that when people return privileges must be reexamined.
Inter-institutional Collaborative Resources
• MACE : Middleware Architecture Committee for Education
• MACE-Dir : MACE Directories Working Group
• NMI : NSF Middleware Initiative
• Internet2
• EDUCAUSE
Online Resources
• USC Global Directory Service, <http://www.usc.edu/gds>
• UMBC Directory, <http://www.umbc.edu/oit>
• Notre Dame Enterprise Directory Service, <http://eds.nd.edu>
• eduPerson object class, <http://www.educause.edu/eduperson/>
• Internet2 Middleware, <http://middleware.internet2.edu/>
• ORCA software, <http://www.orcaware.com>
• Look software, <http://middleware.internet2.edu/dir/look/>
• Cacti, <http://www.cacti.net>
• ND iDS Kerberos4/5 Plug-in, <http://eds.nd.edu/docs/authnplugin>
• Duke iDS Auth Plug-in, <http://www.oit.duke.edu/~rob/krbdirp/>
Contact Information:
Brendan Bellina
http://isd.usc.edu/~bbellina
Rob Banz
http://umbc.edu/~banz
Copyright 2006 by Brendan Bellina and Rob Banz. This work is the intellectual property of the authors. Permission is granted for this material to be shared for non-commercial, educational purposes, provided that this copyright statement appears on the reproduced materials and notice is given that the copying is by permission of the authors. To disseminate otherwise or to republish requires written permission from the authors.