dmitry-fedorov
Won't talk about
(Lots of talks about it already)
This talk will not include:
Marketing stories
Docker is awesome... blah, blah, blah
Namespaces
He's back. And this time he's got a chainsaw.
Yes, folks. We got per-process namespaces. Working. With proper behaviour on exit(), yodda, yodda. Enjoy.
Mount (Mount points)
UTS (Hostname and NIS domain name)
IPC (System V IPC, POSIX message queues)
PID (Process IDs)
Network (Network devices, stacks, ports, etc.)
User (User and group IDs)
Namespaces API
/proc/[pid]/ns
ipc -> ipc:[4026531839]
mnt -> mnt:[4026531840]
net -> net:[4026531956]
pid -> pid:[4026531836]
user -> user:[4026531837]
uts -> uts:[4026531838]
Syscalls:
clone(2)
setns(2)
unshare(2)
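The /proc/[pid]/ns links can also be read from a script. A minimal sketch, assuming a Linux host with /proc mounted; the function name list_ns is made up for illustration. Two processes that print the same ID for a type share that namespace.

```shell
# list_ns PID: print "<type> <id>" for every namespace of a process.
# PID defaults to "self" (the process doing the lookup).
list_ns() {
    pid="${1:-self}"
    for link in /proc/"$pid"/ns/*; do
        # Each entry is a symlink whose target looks like "mnt:[4026531840]".
        target=$(readlink "$link") || continue
        printf '%s %s\n' "$(basename "$link")" "$target"
    done
}

list_ns self
```

Reading the links of another user's process (for example PID 1) usually needs root.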
Mount namespaces
On hostnode:
cat /proc/1/mounts | wc -l
32
Inside container:
docker run -it --rm centos:centos7 cat /proc/1/mounts | wc -l
16
UTS namespaces
hostname, domainname
On hostnode:
uname -n
dfedorov
Inside container:
docker run -it --rm centos:centos7 sh -c 'uname -n'
b543e1bb6eef
IPC namespaces
System V IPC objects, POSIX message queues
/proc/sys/fs/mqueue
/proc/sys/kernel
/proc/sysvipc
On hostnode:
dfedorov@dfedorov:~$ ipcs | wc -l
45
Inside container:
dfedorov@dfedorov:~$ docker run -it --rm centos:centos7 ipcs | wc -l
10
PID namespaces
process ID number space
Nesting namespace:
PID namespaces can be nested: each PID namespace has a parent, except for the initial ("root") PID namespace.
On hostnode:
ps aux | wc -l
298
Inside container:
docker run -it --rm centos:centos7 ps axu | wc -l
2
Network namespaces
network devices, IPv4 and IPv6 protocol stacks, IP routing tables, firewalls
/proc/net
/sys/class/net
/sys/class/net on hostnode:
docker0 eth0 lo lxcbr0 veth1 veth50cf98d veth6b9c9cc
Inside container:
docker run -it --rm centos:centos7 ls /sys/class/net
eth0 lo
Network namespaces
Create netns manually:
ip netns add minimal # Create namespace
ip link add eth1 type veth peer name veth1 # Create virtual ethernet device
ip link set eth1 netns minimal # Attach device to namespace
ip a add 10.0.0.1/24 dev veth1
ip l set veth1 up
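These commands configure only the host end (veth1); the eth1 end inside the namespace is still down and has no address. A possible follow-up, as a sketch: the function name setup_minimal_peer and the address 10.0.0.2/24 are assumptions, not part of the original slide. Must run as root.

```shell
# setup_minimal_peer [NAMESPACE] [ADDRESS]
# Configure the namespace side of the veth pair (run as root).
setup_minimal_peer() {
    ns="${1:-minimal}"        # namespace name used on the slide
    addr="${2:-10.0.0.2/24}"  # assumed address, not in the original
    ip netns exec "$ns" ip addr add "$addr" dev eth1 &&
    ip netns exec "$ns" ip link set eth1 up &&
    ip netns exec "$ns" ip link set lo up
}

# Usage (as root), then a connectivity check from the host:
#   setup_minimal_peer minimal 10.0.0.2/24
#   ping -c 1 10.0.0.2
```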
User namespaces
user credentials (user IDs and group IDs), capabilities
Still strict user mapping. Sad ...
UID 1000 inside container -> 1000 on hostnode
UID 0 inside container -> 0 on hostnode
etc.
And they don't really work ...
sudo ls -l /proc/1/ns/user
lrwxrwxrwx 1 root root 0 Nov 28 14:17 /proc/1/ns/user -> user:[4026531837]
docker run -it --rm centos:centos7 ls -l /proc/1/ns/user
lrwxrwxrwx 1 root root 0 Nov 28 11:18 /proc/1/ns/user -> user:[4026531837]
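Both listings show the same ID, user:[4026531837], so the container shares the host's user namespace. The same comparison can be scripted. A minimal sketch; same_ns is a hypothetical helper name, and reading another user's /proc entries may need root:

```shell
# same_ns TYPE PID_A PID_B: succeed if both processes are in the same
# namespace of the given type (user, pid, net, mnt, ...).
# Returns 2 if either link cannot be read.
same_ns() {
    a=$(readlink "/proc/$2/ns/$1") || return 2
    b=$(readlink "/proc/$3/ns/$1") || return 2
    [ "$a" = "$b" ]
}

# Compare this shell ($$) with whatever process resolves /proc/self.
if same_ns user self "$$"; then
    echo "same user namespace"
fi
```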
User namespaces - Capabilities
per-thread attribute
Used caplist:
CHOWN
DAC_OVERRIDE
FOWNER
MKNOD
NET_RAW
SETGID
SETUID
SETFCAP
SETPCAP
NET_BIND_SERVICE
SYS_CHROOT
KILL
troublesome: mount (cap_sys_admin)
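Whether a process holds a capability can be checked from the CapEff bitmask in /proc/[pid]/status. A small sketch, assuming a Linux /proc; has_cap is an illustrative name, and the bit numbers come from <linux/capability.h>:

```shell
# has_cap MASK BIT: succeed if bit number BIT is set in the hex MASK
# (MASK as printed in the CapEff field of /proc/[pid]/status).
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

CAP_CHOWN=0        # bit numbers from <linux/capability.h>
CAP_SYS_ADMIN=21

# Effective capability mask; awk reads its own status entry, which it
# inherits from this shell. Falls back to 0 if /proc is unavailable.
mask=$(awk '/^CapEff:/ {print $2}' /proc/self/status 2>/dev/null) || mask=0
mask="${mask:-0}"

if has_cap "$mask" "$CAP_SYS_ADMIN"; then
    echo "CAP_SYS_ADMIN: present (mount allowed)"
else
    echo "CAP_SYS_ADMIN: absent"
fi
```

Run inside a container, this is a quick way to see why mount fails there.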
Efficiency
isolated, but still on hostnode
cpu: native
memory: almost native, a few % shaved off for accounting
network: small overhead
disk: native on volumes, overhead on layered fs
What do we need on top of all of it?
unionfs (aufs, vfs)
snapshotting fs (btrfs, zfs)
CoW (thin provisioning, lvm)
Docker Approach
Application level isolation vs OS level isolation.
One task per container.
Deduplication.
Commoditize.
Typical workflow
developer:
-- write some code
-- unit test
-- commit
docker build:
-- environment test (serverspec, rspec etc)
-- functional test
-- push to registry
devops:
-- pull images and run
Old-new challenges
Monitoring.
Logging.
Backups.
Configuration management.
No ssh, and you don't need it.
No ssh, but you have exec.