Putting your users in a Box Greg Thain Condor Week 2013



Outline

› Why put job in a box?
› Old boxes that work everywhere*
    *Everywhere that isn’t Windows
› New shiny boxes

3 Protections

1) Protect the machine from the job.
2) Protect the job from the machine.
3) Protect one job from another.

The perfect box

› Allows nesting
› Need not require root
› Can’t be broken out of
› Portable to all OSes
› Allows full management:
    Creation // Destruction
    Monitoring
    Limiting

A Job ain’t nothing but work

› Resources a job can (ab)use
    CPU
    Memory
    Disk
    Signals
    Network

Previous Solutions

› HTCondor Preempt expression
    PREEMPT = TARGET.MemoryUsage > threshold
      • ProportionalSetSizeKb > threshold
› setrlimit call
    USER_JOB_WRAPPER
    STARTER_RLIMIT_AS
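The setrlimit approach can be sketched in Python. This only illustrates the system call; it is not how STARTER_RLIMIT_AS or a USER_JOB_WRAPPER is implemented inside HTCondor, and the helper name `run_with_as_limit` is made up for the example.

```python
import resource
import subprocess
import sys


def run_with_as_limit(argv, limit_bytes):
    """Run argv with RLIMIT_AS (address space) capped at limit_bytes.

    Hypothetical helper for illustration only; POSIX systems only.
    """
    def apply_limit():
        # Runs in the child, after fork() and before exec().
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        new = limit_bytes
        # An unprivileged process cannot raise its hard limit,
        # so never ask for more than the current one.
        if hard != resource.RLIM_INFINITY:
            new = min(limit_bytes, hard)
        resource.setrlimit(resource.RLIMIT_AS, (new, new))

    return subprocess.run(argv, preexec_fn=apply_limit,
                          capture_output=True, text=True)


if __name__ == "__main__":
    # Ask the child to report the limit it actually inherited.
    probe = "import resource; print(resource.getrlimit(resource.RLIMIT_AS))"
    print(run_with_as_limit([sys.executable, "-c", probe], 4 * 1024**3).stdout)
```

The limit applies to each process separately, so a job that forks many processes can still use more memory in total than the limit suggests.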

From here on out…

› Newish stuff

The Big Hammer

› Some people see this problem, and say
› “I know, we’ll use a Virtual Machine”

Problems with VMs

› Might need hypervisor installed
    The right hypervisor (the right Version…)
› Need to keep full OS image maintained
› Difficult to debug
› Hard to federate
› Just too heavyweight

Containers, not VMs

› Want opaque box
› Much LXC work applicable here
› Work with Best feature of HTCondor ever?

CPU AFFINITY

› ASSIGN_CPU_AFFINITY=true
› Now works with dynamic slots
› Need not be root
› Any Linux version
    Only limits the job
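What CPU affinity does at the process level can be sketched with Python's wrappers around the Linux affinity calls. This is an illustration of the kernel mechanism only, not HTCondor's code; the helper name `pin_to_cpus` is hypothetical.

```python
import os


def pin_to_cpus(cpus, pid=0):
    """Restrict pid (0 = the calling process) to the given CPU ids.

    Hypothetical helper for illustration; Linux only.
    Returns the affinity set that is in effect afterwards.
    """
    allowed = os.sched_getaffinity(pid)
    # Only keep CPUs the process is currently allowed to use.
    wanted = set(cpus) & allowed
    if not wanted:
        raise ValueError("none of the requested CPUs are available")
    os.sched_setaffinity(pid, wanted)
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    first = min(os.sched_getaffinity(0))
    print(pin_to_cpus({first}))
```

No root privilege is needed for a process to narrow its own affinity. The mask only restricts where the pinned process runs; it does not keep other processes off those CPUs.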

PID namespaces

› You can’t kill what you can’t see
› Requirements:
    HTCondor 7.9.4+
    RHEL 6
    USE_PID_NAMESPACES = true
      • (off by default)
    Doesn’t work with privsep
    Must be root

PID Namespaces

Init (1)
  Master (pid 15)
    Startd (pid 26)
      Starter (pid 39)
        Job (pid 1)
      Starter (pid 73)
        Job (pid 1)

Named Chroots

› “Lock the kids in their room”
› Startd advertises set
    NAMED_CHROOT = /foo/R1,/foo/R2
› Job picks one:
    +RequestedChroot = "/foo/R1"
› Make sure path is secure!
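One possible reading of "secure" is that neither the chroot directory nor any directory above it can be modified by an unprivileged user. The check below is a sketch under that assumption; the policy and the function name `path_is_locked_down` are illustrative and do not come from HTCondor.

```python
import os
import stat


def path_is_locked_down(path):
    """Return True if path and every ancestor directory is owned by
    root and not writable by group or others.

    Hypothetical policy for illustration only.
    """
    current = os.path.realpath(path)
    while True:
        st = os.stat(current)
        if st.st_uid != 0:
            return False
        if st.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
            return False
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return True
        current = parent


if __name__ == "__main__":
    for candidate in ("/foo/R1", "/foo/R2"):
        if os.path.exists(candidate):
            print(candidate, path_is_locked_down(candidate))
```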

Control Groups aka “cgroups”

› Two basic kernel abstractions:
› 1) nested groups of processes
› 2) “controllers” which limit resources

Control Cgroup setup

› Implemented as filesystem
    Mounted on /sys/fs/cgroup, or /cgroup or …
› User-space tools in flux
    Systemd
    Cgservice
› /proc/self/cgroup

Cgroup controllers

› Cpu
› Memory
› freezer

Enabling cgroups

› Requires:
    RHEL6
    HTCondor 7.9.5+
    Rootly condor
    No privsep
    BASE_CGROUP=htcondor
    And… cgroup fs mounted…

Cgroups

› Starter puts each job into own cgroup
    Named exec_dir + job id
› Procd monitors
    Procd freezes and kills atomically
› MEMORY attr into memory controller
› CGROUP_MEMORY_LIMIT_POLICY
    Hard or soft
    Job goes on hold with specific message
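The hard versus soft choice can be pictured as choosing which file of the cgroup v1 memory controller receives the limit. The sketch below assumes that "hard" corresponds to `memory.limit_in_bytes` and "soft" to `memory.soft_limit_in_bytes`; that mapping and the helper name `write_memory_limit` are assumptions for illustration, not HTCondor's code.

```python
import os

# cgroup v1 memory controller files (assumed mapping from policy name).
LIMIT_FILES = {
    "hard": "memory.limit_in_bytes",
    "soft": "memory.soft_limit_in_bytes",
}


def write_memory_limit(cgroup_dir, limit_mb, policy="soft"):
    """Write limit_mb (in MiB) into the limit file selected by policy.

    Hypothetical helper for illustration; returns the file written.
    """
    try:
        filename = LIMIT_FILES[policy]
    except KeyError:
        raise ValueError(f"unknown policy: {policy!r}") from None
    target = os.path.join(cgroup_dir, filename)
    with open(target, "w") as f:
        f.write(str(limit_mb * 1024 * 1024))
    return target
```

On a real system `cgroup_dir` would be a directory under the mounted memory controller and writing to it requires root; pointing the function at a scratch directory is enough to see which file each policy selects.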

Cgroup artifacts

StarterLog:
04/22/13 11:39:08 Requesting cgroup htcondor/condor_exec_slot1@localhost for job

ProcLog:
… cgroup to htcondor/condor_exec_slot1@localhost for ProcFamily 2727.
04/22/13 11:39:13 : PROC_FAMILY_GET_USAGE
04/22/13 11:39:13 : gathering usage data for family with root pid 2724
04/22/13 11:39:17 : PROC_FAMILY_GET_USAGE
04/22/13 11:39:17 : gathering usage …

$ condor_q
-- Submitter: localhost : <127.0.0.1:58873> : localhost
 ID   OWNER   SUBMITTED    RUN_TIME    ST  PRI  SIZE  CMD
 2.0  gthain  4/22 11:36   0+00:00:02  R   0    0.0   sleep 3600

$ ps ax | grep 3600
gthain 2727 4268 4880 condor_exec.exe 3600

A process with Cgroups

$ cat /proc/2727/cgroup
3:freezer:/htcondor/condor_exec_slot1@localhost
2:memory:/htcondor/condor_exec_slot1@localhost
1:cpuacct,cpu:/htcondor/condor_exec_slot1@localhost
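Each line of /proc/<pid>/cgroup has the form hierarchy-ID:controller-list:cgroup-path. A minimal parser for that format, with hypothetical names `CgroupEntry` and `parse_proc_cgroup`, might look like this:

```python
from typing import NamedTuple


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    controllers: tuple
    path: str


def parse_proc_cgroup(text):
    """Parse the contents of /proc/<pid>/cgroup into CgroupEntry tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first two colons only; the path is kept whole.
        hierarchy_id, controllers, path = line.split(":", 2)
        names = tuple(c for c in controllers.split(",") if c)
        entries.append(CgroupEntry(int(hierarchy_id), names, path))
    return entries


if __name__ == "__main__":
    with open("/proc/self/cgroup") as f:
        for entry in parse_proc_cgroup(f.read()):
            print(entry)
```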

$ cd /sys/fs/cgroup/memory/htcondor/condor_exec_slot1@localhost/
$ cat memory.usage_in_bytes
258048
$ cat tasks
2727
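The same two files can be read from a program. The sketch below assumes a cgroup v1 memory controller directory like the one shown above; the function name `read_cgroup_usage` is hypothetical.

```python
import os


def read_cgroup_usage(cgroup_dir):
    """Return (memory_bytes, task_ids) for a cgroup v1 memory cgroup.

    Hypothetical helper for illustration; reads the two files
    shown in the shell session.
    """
    with open(os.path.join(cgroup_dir, "memory.usage_in_bytes")) as f:
        usage = int(f.read().strip())
    with open(os.path.join(cgroup_dir, "tasks")) as f:
        # One task id per line; an empty cgroup yields an empty list.
        task_ids = [int(tok) for tok in f.read().split()]
    return usage, task_ids
```

Called on the directory from the session above, it would return the byte count and the list of task ids that `cat` printed.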

MOUNT_UNDER_SCRATCH

› Or, “Shared subtrees”
› Goal: protect /tmp from shared jobs
› Requires
    Condor 7.9.4+
    RHEL 5
    Doesn’t work with privsep
    HTCondor must be running as root
    MOUNT_UNDER_SCRATCH = /tmp,/var/tmp

MOUNT_UNDER_SCRATCH

MOUNT_UNDER_SCRATCH=/tmp,/var/tmp

Each job sees private /tmp, /var/tmp

Downsides:
    No sharing of files in /tmp

Future work

› Per job FUSE and other mounts?
› non-root namespaces

Not covered in this talk

› Prevent jobs from messing with everyone on the network:
› See Lark and SDN talks Thursday at 11

Conclusion

› Questions?
› See cgroup reference material in kernel doc
    • https://www.kernel.org/doc/Documentation/cgroups/cgroups.txt
› LWN article about shared subtree mounts:
    • http://lwn.net/Articles/159077/