[Rakuten TechConf2014] [C-5] Ichiba Architecture on ExaLogic


Rakuten Technology Conference 2014 "Ichiba Architecture on ExaLogic" Euncheol Kweon, Ryu Watanabe (Rakuten)

ICHIBA Architecture On Exalogic

Oct/25/2014

Euncheol Kweon / Watanabe Ryu

EC Core Platform Group ICE Project Team.

1

Introduction

Middleware Architecture

Life Cycle

Cost Effective Operation

Exalogic Operation On OZ Manager

2

About This Presentation

1. Exalogic Architecture

- Euncheol Kweon

2. Exalogic Operation On OZ Manager

- Watanabe Ryu

3

Introduction

Middleware Architecture

Life Cycle

Cost Effective Operation

Exalogic Operation On OZ Manager

4

No.1 EC Site in Japan

5

No.1 EC Site in Japan

6

[Diagram: several Biz App and Biz API components]

7

Physical Servers

Database

Middleware

Business Application

Network

Service Components

Operation

8

Service A, Service B, Service C, Service D, Service E, Service F, Service G

Many Servers for Service

Many Engineers for Operation

9

[Diagram: services labeled Heavy, Low, Heavy, High, High, Heavy, Low]

Many Engineers for Operation

Operational Efficiency

10

Service Peak Time

[Chart: System Load (0 to 6) over Time (00:00 to 23:00) for A Service and B Service]

11

Heavy

Low

Middle

If we can use servers more effectively

High

12

If it can be operated by 4 people.

13

14

What is Exalogic?

Oracle Engineered System

Specialized Product for Middleware

- WebLogic

- Coherence

- Tuxedo

Total Solution

15

Exalogic Elastic Cloud

Software

Exabus

Oracle Traffic Director

Storage Management

Software

Operating System

Exalogic Control

VM Server

Hardware

Compute Nodes

- Max 30 Nodes

- Xeon 2.7 GHz, 8 Cores * 2

- 256 GB Memory

- 200 GB SSD

Integrated Storage

- ZFS Storage Appliance

- 60 TB Storage

InfiniBand Switches

- 40 Gb/s throughput

16

Migration

Operation Cost Down

Non-Stop Operation

Our Plan for Exalogic

Japan ICHIBA

Over 50 Services

About 400 Servers

17

Introduction

Middleware Architecture

Life Cycle

Cost Effective Operation

Exalogic Operation On OZ Manager

18

Exalogic

Optional

Middleware Structure

WebLogic

Coherence

OTD

Jennifer

19

OTD (Oracle Traffic Director)

OTD

Application A:XXX1

Instance 01 : XX.XX.0.1:0000

Instance 02 : XX.XX.0.2:0000

Instance 03 : XX.XX.0.3:0000

Instance 04 : XX.XX.0.4:0000

Application B:XXX2

Application C:XXX3

Application D:XXX4

Application B

Application C

Application D

Application A

Instance 01

Instance 02

Instance 03

Instance 04
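
The routing on this slide can be pictured roughly as follows. This is a toy model only, not OTD's configuration format or API: the listener ports and addresses are hypothetical placeholders (the slide masks the real ones), and round-robin selection is an assumption.

```python
from itertools import cycle

# Hypothetical listener ports and origin servers; the slide masks the real values.
POOLS = {
    8001: ["10.0.0.1:7003", "10.0.0.2:7003", "10.0.0.3:7003", "10.0.0.4:7003"],  # Application A
    8002: ["10.0.1.1:7003"],                                                      # Application B
}


class TrafficDirector:
    """Toy load balancer: one origin-server pool per listener port."""

    def __init__(self, pools):
        # cycle() yields the pool members in round-robin order, forever.
        self._pools = {port: cycle(servers) for port, servers in pools.items()}

    def route(self, listener_port):
        """Return the origin server that receives the next request."""
        return next(self._pools[listener_port])


otd = TrafficDirector(POOLS)
picked = [otd.route(8001) for _ in range(5)]  # the fifth request wraps around
```

Each application keeps its own pool, so adding an instance only means adding one entry to that application's list.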

20

Business Applications

WebLogic (Application Server)

Java

Fusion Middleware Installation - 11g

- 12c

bin

config

init-info

lib

pending

security

servers

startWebLogic.sh

stopWebLogic.sh

tmp

WebLogic Domain

Instance 02

- JRockit

- HotSpot

Instance 01

Instance 03

Instance 04

Exalogic Optimized

21

Coherence Cluster (having DAO)

Coherence (Data Grid Layer)

Java - JRockit

- HotSpot

Fusion Middleware Installation - 12c

Database

Business Logic

Session

Object

22

Jennifer (Real Time Monitoring)

Application On Middleware

Instance 01

Instance 02

Instance 03

Instance 04

Jennifer Server

Process Time

Method Profiling

System Status

IO Status

VM Status

23

System Design On Exalogic

24

Nonstop

Automated

Standardized

Design Keywords

25

Shared Resources

Java

Middleware

Coherence Cluster

WebLogic Domain

Configuration

Operation Tools

Application Logs

mount

mount

【 ZFS】

26

- bin

- config

- lib

- startWebLogic.sh

- stopWebLogic.sh

- servers

- instance 01/*

- instance 02/*

- instance 03/*

- log

- instance 01/*

- instance 02/*

- instance 03/*

Separated Parameter for Private Resource

${weblogic.Name}

${tangosol.coherence.member}

ex) log, data

Separated Private Resources

Application On Middleware

Instance 01

Instance 02

Instance 03

WebLogic Domain

Shared
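
A minimal sketch of the idea on this slide: all instances share one domain directory, and a per-instance property such as ${weblogic.Name} keeps logs and data apart. The path layout and the `resolve` helper are hypothetical; only the placeholder name comes from the slide.

```python
def resolve(path_template, properties):
    """Replace every ${key} placeholder with the value of that property."""
    for key, value in properties.items():
        path_template = path_template.replace("${" + key + "}", value)
    return path_template


# Hypothetical shared-domain layout; one template serves every instance.
LOG_TEMPLATE = "/shared/domain/log/${weblogic.Name}/server.log"

paths = [
    resolve(LOG_TEMPLATE, {"weblogic.Name": name})
    for name in ("instance01", "instance02", "instance03")
]
```

Because the instance name is the only varying input, the same configuration file can be mounted by every instance without collisions.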

27

Rule Based System Configuration

Machines

IP Zone

Exalogic

WebLogic

Coherence

OTD

Jennifer

Two Parameters for Construction

Ports

Directories

App Name

App Number
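
As an illustration of deriving ports and directories from the two construction parameters, here is a sketch. The numbering rule, port offsets, and directory root are all hypothetical; the slides do not state the actual rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppLayout:
    admin_port: int
    managed_port: int
    domain_dir: str
    log_dir: str


def derive_layout(app_name: str, app_number: int) -> AppLayout:
    """Derive every port and directory from App Name and App Number alone."""
    # Hypothetical rule: each application owns a block of 10 ports above 10000.
    base = 10000 + app_number * 10
    root = f"/exalogic/apps/{app_name}"
    return AppLayout(
        admin_port=base + 1,
        managed_port=base + 3,
        domain_dir=f"{root}/domain",
        log_dir=f"{root}/log",
    )


layout = derive_layout("rpage", 12)
```

With a fixed rule like this, nobody has to look up or negotiate a port: knowing the application's name and number is enough to find it.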

28

Migrating ManagedServer

WebLogic Domain

WebLogic is Running on Floating IP

Instance 02 (10.10.1.2)

Instance 01 (10.10.1.1)

Instance 03 (10.10.1.3)

Instance 04 (10.10.1.4)

Failed

Instance 01 (10.10.1.1)

Instance 03 (10.10.1.3)

Instance 04 (10.10.1.4)

Instance 02 (10.10.1.2)

Shutdown → Moving IP → Startup
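
The three migration steps can be modeled as below. This is a simulation under stated assumptions, not the actual WebLogic or Exalogic tooling: `Node` and `migrate` are hypothetical names, and the node names are invented.

```python
class Node:
    """A compute node holding floating IPs and running instances."""

    def __init__(self, name):
        self.name = name
        self.ips = set()       # floating IPs currently bound to this node
        self.running = set()   # instance names running on this node


def migrate(instance, ip, source, target):
    """Shut down, move the floating IP, start up; return the steps performed."""
    steps = []
    source.running.discard(instance)   # 1. shutdown on the failed node
    steps.append("shutdown")
    source.ips.discard(ip)             # 2. the floating IP follows the instance
    target.ips.add(ip)
    steps.append("move-ip")
    target.running.add(instance)       # 3. startup on the healthy node
    steps.append("startup")
    return steps


node1, node2 = Node("node1"), Node("node2")
node1.ips.add("10.10.1.1")
node1.running.add("Instance 01")
steps = migrate("Instance 01", "10.10.1.1", node1, node2)
```

Since the instance keeps its address, the load balancer's origin-server list does not need to change after the move.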

29

Standardized Startup Scripts

- server_startup

- startServer.sh

- scripts

- addon.sh

- automation.sh

- options.sh

- properties.sh

- options.conf

Jennifer Ready

Remote Control

Customizing

Process Locking

Log & Backup

JMX Ready

- Coherence Server / CUI Client

- WebLogic Admin & Managed Server

30

WebLogic Domain

nodemanager.properties

StartScriptName=startServer.sh

….

….

….

NodeManager

NodeManager and Startup Scripts

- server_startup

- startServer.sh

- scripts

- addon.sh

- automation.sh

- options.sh

- properties.sh

- options.conf

ManagedServer

ManagedServer

AdminServer

31

Non-Stop Operation

32

Listener

Listen-Port : 8001

Main Feature for Nonstop Operation

OTD : rakuten.co.jp

Origin Server Pool

10.10.1.1:7003

10.10.1.2:7003

Origin Server Pool

10.10.2.1:7003

10.10.2.2:7003

33

10.10.1.1 : 7003

10.10.1.2 : 7003

OTD Configuration for Application

Listener

OTD : rakuten.co.jp

default-route : 8001

test-route : 9001

Origin Server Pool

10.10.2.1 : 7003

10.10.2.2 : 7003

34

A Domain

ManagedServer

Dual Domains for Application

B Domain

ManagedServer

8001 9001

Origin Server Pool B

10.10.1.1 : 7003

10.10.1.2 : 7003

Origin Server Pool A

10.10.2.1 : 7003

10.10.2.2 : 7003

35

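Basic Concept

[Diagram: OTD with default-route and test-route in front of two versions, shown in three states]

- Current Service (versions 1.0 and 0.9)

- Next Version Release, Testing via Proxy (versions 1.0 and 1.1)

- Switching in 20 Seconds, then Current Service (versions 1.0 and 1.1)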
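
The release, test, and switch sequence can be sketched as a swap of two routes. This is an illustrative model rather than OTD's real interface: the `RouteTable` class is hypothetical, and only the route names and origin addresses come from the slides.

```python
class RouteTable:
    """Two routes in front of two domains; switching swaps their targets."""

    def __init__(self, default_pool, test_pool):
        self.routes = {"default-route": default_pool, "test-route": test_pool}

    def switch(self):
        routes = self.routes
        routes["default-route"], routes["test-route"] = (
            routes["test-route"],
            routes["default-route"],
        )


service = {"version": "1.0", "servers": ["10.10.1.1:7003", "10.10.1.2:7003"]}
standby = {"version": "0.9", "servers": ["10.10.2.1:7003", "10.10.2.2:7003"]}
otd = RouteTable(default_pool=service, test_pool=standby)

standby["version"] = "1.1"   # release the next version to the standby domain
# ... testers reach it through test-route while users stay on default-route ...
otd.switch()                 # users now get 1.1; 1.0 remains available for rollback
```

Switching touches only the routing table, which is consistent with the short switching time quoted on the slide; calling `switch()` again would serve as a rollback.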

36

6 Hours

20 Secs

Operating Time for Service Release

37

Sessionful Application

A Domain

Coherence*Web

B Domain

Coherence*Web

Getting Session Data

from Coherence Cluster
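
Why a sessionful application survives a domain switch can be shown with a small stand-in. Here a plain dictionary plays the role of the Coherence cluster; the classes are hypothetical and do not use the Coherence*Web API.

```python
class SessionCluster:
    """Stand-in for the shared Coherence cluster: an in-memory dict."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        # Create the session on first access, then always return the same object.
        return self._data.setdefault(session_id, {})


class Domain:
    """A WebLogic domain that keeps no session state of its own."""

    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster

    def handle(self, session_id, item=None):
        session = self.cluster.get(session_id)
        if item is not None:
            session.setdefault("cart", []).append(item)
        return list(session.get("cart", []))


cluster = SessionCluster()
domain_a, domain_b = Domain("A", cluster), Domain("B", cluster)

domain_a.handle("s1", "book")              # request served by A Domain
cart_after_switch = domain_b.handle("s1")  # after switching, B Domain sees the same cart
```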

38

Application Package

Construction

WebLogic Domains

Jennifer Servers

1~2 Weeks On Legacy Environment

A B

39

WebLogic Template

Automated

Tool

Jennifer Template

Application Name

Application Number

Deploy

Service IN

END

Construction

40

Construction

5 Minutes On Exalogic

Application Package

WebLogic Domains

Jennifer Servers A B

41

Introduction

Middleware Architecture

Life Cycle

Cost Effective Operation

Exalogic Operation On OZ Manager

42

Release

Switching

Testing

Operation

43

Release

Switching

Testing

Operation

44

root/domain{A,B}

- app/foo-1.0.1.war

- configuration/foo.properties

- library/lib-0.2.2.jar

/root/domain{A,B}

- app/foo-1.0.0.war

- configuration/foo.properties

- library/lib-0.2.1.jar

Service Domain

Release Material

Release Material having Same Path

maven-assembly-plugin

45

Recipe for Release

Release Material

Release History Directory

Release Material

Standby Domain

Updating New or Modified Resource

Latest Release
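
Because the release material mirrors the domain's directory layout, "updating new or modified resources" can be sketched as a content comparison. The helper names and the temporary directories below are hypothetical; the file names follow the earlier slide.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def digest(path):
    """Content hash used to decide whether a file changed."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def apply_release(release_root, domain_root):
    """Copy files that are new or differ in content; return their relative paths."""
    updated = []
    for src in sorted(release_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(release_root)
        dst = domain_root / rel
        if not dst.exists() or digest(src) != digest(dst):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
            updated.append(rel.as_posix())
    return updated


with tempfile.TemporaryDirectory() as tmp:
    release, domain = Path(tmp, "release"), Path(tmp, "domainB")
    (release / "app").mkdir(parents=True)
    (release / "configuration").mkdir()
    (domain / "configuration").mkdir(parents=True)
    (release / "app" / "foo-1.0.1.war").write_text("new build")
    (release / "configuration" / "foo.properties").write_text("k=v")
    (domain / "configuration" / "foo.properties").write_text("k=v")  # unchanged
    updated = apply_release(release, domain)
```

This sketch only adds and overwrites; removing files that no longer belong to the release (such as an older .war) would need a separate step.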

46

Release Work Flow

New Deployment

Release Files

Undeployment

Restarting WebLogic

47

Switching

Operation

Release

Testing

48

OTD

Standby Domain

Testing Application via Proxy Server

intranet

Proxy

Service Domain

49

Problem Solving by Repetitive Testing

Updating Application

Testing

Bug Fix

50

Operation

Release

Switching

Testing

51

Standby Domain

Real Time Service Switching

Service Domain

52

Release History Directory

root.20141025

Creating Operation Directory

Release Material

Operation Directory

${applicationName}/operation/root

renaming

link
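
One plausible reading of "renaming" and "link" is that each release is kept under a dated name and the operation directory is a symbolic link to the active one. The sketch below assumes a POSIX filesystem; the `activate` helper and the second release name are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def activate(history_dir, release_name, operation_link):
    """Point the operation directory at one dated release via a symlink."""
    target = history_dir / release_name
    tmp_link = operation_link.with_name(operation_link.name + ".tmp")
    if tmp_link.is_symlink():
        tmp_link.unlink()
    os.symlink(target, tmp_link, target_is_directory=True)
    os.replace(tmp_link, operation_link)  # rename over the old link (atomic on POSIX)


with tempfile.TemporaryDirectory() as tmp:
    history = Path(tmp, "history")
    (history / "root.20141018").mkdir(parents=True)  # hypothetical earlier release
    (history / "root.20141025").mkdir()
    link = Path(tmp, "operation", "root")
    link.parent.mkdir()
    activate(history, "root.20141018", link)
    activate(history, "root.20141025", link)
    active_name = Path(os.readlink(link)).name
```

Tools that operate on `${applicationName}/operation/root` then never need to know which dated release is current.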

53

Release

Testing

Switching

Operation

54

Service Domain

Standby Domain

Rollback to Previous Version

Release History Directory

Release Directory of Previous Version

If standby side is updated...

Release

Switching
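
Choosing the rollback target from the release history directory could look like this. It assumes the dated naming shown earlier (root.YYYYMMDD), which sorts chronologically; the function and the older release names are hypothetical.

```python
def previous_release(history, current):
    """Return the release that immediately precedes `current` in the history."""
    ordered = sorted(history)           # root.YYYYMMDD sorts in date order
    position = ordered.index(current)   # raises ValueError if current is unknown
    if position == 0:
        raise LookupError("no earlier release to roll back to")
    return ordered[position - 1]


history = ["root.20141025", "root.20141011", "root.20141018"]
target = previous_release(history, "root.20141025")
```

The chosen directory would then be released to the standby side and switched in, the same path a normal release takes.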

55

Release

Modifying Configuration

Normal Operation

Operation Directory

${applicationName}/operation/root

Service Domain

Standby Domain

56

Introduction

Middleware Architecture

Life Cycle

Cost Effective Operation

Exalogic Operation On OZ Manager

57

WLST Operation Tools

Start / Shutdown

Deployment

Moving Instance

Undeployment

Getting Information

Modifying Configuration

58

Legacy System

Operation Documents

By System

By Material

By Time

A Reason Of High Operating Cost

Release Document

Release Document

Release Document

59

Exalogic

Operation Documents

Construction

Release

Switching

Operation

Recipe

Operation Recipe for All Application

60

Recipe Execution by Trigger

Trigger

Sequential Execution

Automatic Execution

Recipe

Arguments
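
A recipe executed by a trigger can be sketched as an ordered list of steps that all receive the same arguments. The step functions, their order, and the argument names are hypothetical; real steps would call the operation tools rather than return strings.

```python
def undeploy(args):
    return f"undeploy {args['app']} from {args['domain']}"


def copy_files(args):
    return f"copy release files of {args['app']} {args['version']}"


def deploy(args):
    return f"deploy {args['app']} {args['version']} to {args['domain']}"


def restart(args):
    return f"restart WebLogic in {args['domain']}"


# One recipe serves every application; only the arguments differ.
RECIPES = {"release": [undeploy, copy_files, deploy, restart]}


def on_trigger(recipe_name, args):
    """Run the recipe's steps in order and return what was executed."""
    executed = []
    for step in RECIPES[recipe_name]:
        executed.append(step(args))
    return executed


log = on_trigger("release", {"app": "foo", "version": "1.0.1", "domain": "domainB"})
```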

61

Trigger System

62

Introduction

Middleware Architecture

Life Cycle

Cost Effective Operation

Exalogic Operation On OZ Manager

63

What is OZ Manager?

64

Application Management Tool

Built from scratch

65

Why build from scratch?

66

Because it’s Rakuten Ichiba

67

What is Rakuten Ichiba like?

Lots of teams

Lots of applications

Lots of instances

Lots of releases

68

What is Rakuten Ichiba like?

Lots of teams

Lots of applications

Lots of instances

Lots of releases

“team reliant operations”

more than 30 teams

Differences in rules

Differences in judgment

Differences in procedures

69

What is Rakuten Ichiba like?

Lots of teams

Lots of applications

Lots of instances

Lots of releases

more than 150 applications

“scattered apps & tools”

Differences in language

Differences in architecture

Differences in monitoring

70

What is Rakuten Ichiba like?

Lots of teams

Lots of applications

Lots of instances

Lots of releases

more than 3000 instances

“excessive alert mails”

10,000 mails when trouble occurs

Only 700 can be received

Important mails get buried

71

What is Rakuten Ichiba like?

Lots of teams

Lots of applications

Lots of instances

Lots of releases

more than 1000 releases in a year

“more trouble risks”

Required to attract

Bugs occur

Human errors happen

72

What we needed

standardize basic operations

integrate and portalize management tools

collect logs and pack alert mails

simplify and clarify for fast detection
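
"Packing" alert mails can be pictured as grouping alerts so that one incident yields one mail instead of one mail per instance. The grouping key (application and checker) and the record fields are assumptions for illustration, not OZ Manager's actual behavior.

```python
from collections import defaultdict


def pack_alerts(alerts):
    """Group alerts by (application, checker) and emit one summary per group."""
    grouped = defaultdict(list)
    for alert in alerts:
        grouped[(alert["app"], alert["checker"])].append(alert["instance"])
    return [
        f"[{app}] {checker}: {len(instances)} instance(s): {', '.join(sorted(instances))}"
        for (app, checker), instances in sorted(grouped.items())
    ]


alerts = [
    {"app": "rpage", "checker": "applog", "instance": "rpageA02"},
    {"app": "rpage", "checker": "applog", "instance": "rpageA01"},
    {"app": "orderapi", "checker": "cpu%", "instance": "orderapi01"},
]
mails = pack_alerts(alerts)
```

Three raw alerts become two mails here; with thousands of instances failing the same check, the reduction is what keeps important mails visible.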

73

OZ Manager

74

OZ Manager

1. Brief Overview

2. Plug-in configuration

3. Data relation

4. Monitoring for Exalogic

5. Demonstration

6. Future plan for Exalogic

75

OZ Manager

1. Brief Overview

Service Applications

Monitoring Tools

Log Collect Tools

LogCollector

operator

76

2. Plug-in configuration

plug-in manager

Checker

• Get status

• Stop

• Start

• Update

• Get results

Log Collect

• Get status

• Stop

• Start

• Update

• Get logs

Application

• Get status

• Stop

• Start

other
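
The operations listed per plug-in type suggest a common interface that the plug-in manager dispatches to. The classes below are a hypothetical sketch, not OZ Manager's code; in particular, "Update" is interpreted here as "run the check and store its result", which is an assumption.

```python
class Plugin:
    """Operations shared by every plug-in type: get status, stop, start."""

    def __init__(self, name):
        self.name = name
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False

    def get_status(self):
        return "running" if self.running else "stopped"


class CheckerPlugin(Plugin):
    """Adds update and get results on top of the shared operations."""

    def __init__(self, name, check):
        super().__init__(name)
        self.check = check      # callable returning a result string
        self.results = []

    def update(self):
        if self.running:        # a stopped checker collects nothing
            self.results.append(self.check())

    def get_results(self):
        return list(self.results)


class PluginManager:
    """Registers plug-ins by name and dispatches operations to them."""

    def __init__(self):
        self.plugins = {}

    def register(self, plugin):
        self.plugins[plugin.name] = plugin

    def call(self, plugin_name, operation):
        return getattr(self.plugins[plugin_name], operation)()


manager = PluginManager()
manager.register(CheckerPlugin("cpu-check", lambda: "cpu ok"))
manager.call("cpu-check", "start")
manager.call("cpu-check", "update")
```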

77

2. Plug-in configuration

plug-in manager

Checker, Log Collect, Application, other

[Diagram: plug-ins registered under the plug-in manager. Labels: nagios, Exalogic, logcollector, pandora, weblogic, module, logchk, basiclog, perfchk, perflog, qpslog, otd originchk, CPU% chk, logcollector admin, admin, other, Exalogic service]

78

Host Application

exalogic

Cluster

rpage rpageA rpageA01 node1

rpageA02

rpageB01 node2

rpageB02

rpageB

orderapi

rpageA rpage-inst nodeX

Controlgroup

Filter Checker

error applog inst-chk cpu%

3. Data relation

Link

jennifer

exalogic

Component & Control

79

3. Data Relation

Authentication & Authorization

User

Ryu

Usergroup

admin

Menu

User

Cluster

exalogic

Mall

Kweon-san

EC Core

Cluster Application Host

rpage

orderapi

userX
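
An authorization check over these relations might look like the following. The group memberships and grants are invented for illustration; the slide shows the entity types (user, usergroup, cluster, application) but not who is allowed what.

```python
# Hypothetical assignments: the slide names the entities, not the actual grants.
USERGROUPS = {
    "admin": {
        "users": {"Ryu"},
        "clusters": {"exalogic"},
        "applications": {"rpage", "orderapi"},
    },
    "EC Core": {
        "users": {"Kweon-san"},
        "clusters": {"exalogic"},
        "applications": {"rpage"},
    },
}


def can_operate(user, cluster, application):
    """A user may operate a target if any of their usergroups grants it."""
    return any(
        user in group["users"]
        and cluster in group["clusters"]
        and application in group["applications"]
        for group in USERGROUPS.values()
    )
```

For example, under these made-up grants `can_operate("Kweon-san", "exalogic", "orderapi")` is false, while a user in no group at all is denied everything.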

80

4. Monitoring for Exalogic

Type, Target, Checker

Type: Cluster

Target: Weblogic service A/B

• Application log

• Performance log

• Alert log

Target: Coherence A/B

• Application log

• Stdout log

Type: Application

system

Target: Oracle EM

• Admin server process

• Client agent process

Target: Nodemanager

• Process for each version

Target: OTD

• Process for each OTD configuration

Target: weblogic instances

• OTD origin online check

Target: coherence instances

• Coherence instance check

Type: Host

Target: Each node

• Ping check

• CPU usage

• Load average

• Filesystem usage

• Burst process
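
The host-level checks can be sketched as threshold comparisons on collected metrics. The metric names and limits below are hypothetical, since the slides list the checks but not their thresholds, and metric collection itself is left out.

```python
# Hypothetical thresholds; the slides name the checks but not their limits.
THRESHOLDS = {"cpu_usage": 90.0, "load_average": 8.0, "filesystem_usage": 85.0}


def check_host(node, metrics, reachable=True):
    """Return one alert line per failed check for a single node."""
    if not reachable:
        # Without a ping response the other metrics are not trustworthy.
        return [f"{node}: ping check failed"]
    return [
        f"{node}: {name} {metrics[name]} >= {limit}"
        for name, limit in THRESHOLDS.items()
        if metrics.get(name, 0.0) >= limit
    ]


alerts = check_host(
    "node1",
    {"cpu_usage": 95.5, "load_average": 2.0, "filesystem_usage": 85.0},
)
```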

81

5. Demonstration

Demo Video

Basic Usage

82

5. Demonstration

Demo Video

Maintenance Tool

83

5. Demonstration

Demo Video

Command line Interface

84

5. Demonstration

Demo Video

Exalogic Special Features

85

6. Future plan for Exalogic

AB service side switch

instance location list

Instance relocation

Exalogic-coherence plugin

86

Operation Zero

87