The 7 Lessons for Highly Effective Real-time Server
Great Technology For Great Games

DK Moon
dkmoon@ifunfactory.com

[iFunFactory] 2017 NDCP

Personal Relationship with Nexon

✓ Worked on six MMORPG game servers at Nexon from 1999 through 2005.
✓ Left Nexon for a better opportunity in the States.
✓ Returned to Nexon, having found no better place (unfortunately…).
✓ Worked on a mobile game server framework from 2011 through 2013.
✓ Left again to start a business in the game server software industry.

About This Talk

✓ Case studies on server-related issues and lessons from them.
✓ Reference game #1: PC MMORPG
  • Peak CCU: 140K
  • Peak CCU/Server: 15K
✓ Reference game #2: Mobile real-time MO game
  • Being developed by Nexon/IFF and published by Tencent
  • Passed 3 beta tests (CCU confidential)

Game Service Architecture (Data Plane)

[Diagram: components and the layers they belong to]
• Game Client: Presentation Layer
• Load balancer / Switch: Access Layer
• Game Server: Logic/Application Layer
• Cache: Cache Layer
• DB Server: Persistence Layer

Lesson #1: Know Your Traffic Pattern

✓ Case: saturated NIC and oversubscribed network
  • In the past, a network interface card (NIC) could be saturated.
  • This no longer happens, but the core switch can still be choked.
✓ Observation
  • Real-time MMO: 200-300 Bps.
  • REST-based mobile: 700-800 Bps with spikes.
✓ Suggestion
  • Meter your traffic.
  • Prefer a binary message format.
  • Also, correlate traffic with CPU usage.

[Diagram: servers connected through a core switch]
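The metering and binary-format suggestions can be sketched as follows. This is a minimal illustration, not the reference games' code: the message fields, the `MOVE_LAYOUT` wire layout, and the `TrafficMeter` class are all assumptions made up for the example.

```python
import json
import struct

# Hypothetical movement update. The field names and the wire layout are
# assumptions for illustration, not the format used by the reference games.
MOVE_LAYOUT = "<IffH"  # player id (uint32), x and y (float32), hp (uint16)


def encode_binary(player_id, x, y, hp):
    """Pack the update into a fixed 14-byte little-endian record."""
    return struct.pack(MOVE_LAYOUT, player_id, x, y, hp)


def encode_json(player_id, x, y, hp):
    """Encode the same update as UTF-8 JSON text for comparison."""
    message = {"player_id": player_id, "x": x, "y": y, "hp": hp}
    return json.dumps(message).encode("utf-8")


class TrafficMeter:
    """Counts bytes per message type so traffic can be reported in Bps."""

    def __init__(self):
        self.bytes_by_type = {}

    def record(self, message_type, payload):
        """Add the payload size to the running total for this message type."""
        total = self.bytes_by_type.get(message_type, 0)
        self.bytes_by_type[message_type] = total + len(payload)

    def bytes_per_second(self, message_type, elapsed_seconds):
        """Average rate for one message type over the measured interval."""
        if elapsed_seconds <= 0:
            raise ValueError("elapsed_seconds must be positive")
        return self.bytes_by_type.get(message_type, 0) / elapsed_seconds
```

Recording every outgoing payload under its message type makes it possible to see which messages dominate the traffic, and logging the same interval's CPU usage next to these figures gives the traffic-to-CPU correlation. The JSON encoding of the same update is noticeably larger than the 14-byte binary record.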

Lesson #2: Avoid Copying Packets

✓ Case: Memory copy functions were always at the top of profiling results.
✓ Target CPU usage: 50-70%
  • Figures too low: inefficient program concurrency.
  • Figures too high: unnecessary memory copying and looping.
✓ Suggestion
  • Pass packets around by pointer.
  • Minimizing packet sizes also helps here.
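As a rough illustration of passing packets by reference instead of copying them, the sketch below uses Python's `memoryview` as a stand-in for a pointer. The 4-byte header layout (2-byte message type, 2-byte body length, big-endian) and the `split_packet` name are assumptions for the example only.

```python
HEADER_SIZE = 4  # assumed layout: 2-byte message type + 2-byte body length


def split_packet(buffer):
    """Return (message_type, body) where body is a view, not a copy."""
    view = memoryview(buffer)
    if len(view) < HEADER_SIZE:
        raise ValueError("packet shorter than header")
    message_type = int.from_bytes(view[0:2], "big")
    body_length = int.from_bytes(view[2:4], "big")
    if len(view) < HEADER_SIZE + body_length:
        raise ValueError("packet body is truncated")
    # Slicing a memoryview shares the receive buffer's memory.
    body = view[HEADER_SIZE:HEADER_SIZE + body_length]
    return message_type, body
```

Handlers that receive `body` read directly from the receive buffer. The trade-off is the same as with raw pointers: the buffer must stay alive and unmodified for as long as any handler still holds the view.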

Lesson #3: Focus on Client Obfuscation

✓ Case: Adopted lightweight byte-by-byte XOR encryption. The encryption algorithm itself was never broken. Instead, there were lots of packet replay attacks and client hack attempts.
✓ Observation
  • Hackers do not bother reverse-engineering the encryption algorithm.
  • Instead, they hack into the client and let it do the encryption for them.
✓ Suggestion
  • Pick a lightweight encryption algorithm as long as it can prevent packet forgery. (A complex algorithm uses up too much CPU.)
  • Focus more on preventing game client manipulation.
  • Also, prepare for packet replay attacks.
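One common way to prepare for forgery and replay is a per-session sequence number covered by a message authentication tag. The sketch below swaps in HMAC-SHA256 from the Python standard library for that purpose; it is not the XOR scheme described in the case, and the packet layout and the `seal` and `ReplayGuard` names are assumptions. It also assumes an ordered transport such as TCP, so sequence numbers only ever increase.

```python
import hashlib
import hmac
import struct

TAG_SIZE = 32  # HMAC-SHA256 digest length in bytes
SEQ_SIZE = 8   # assumed: 64-bit big-endian sequence number


def seal(key, sequence, payload):
    """Client side: prefix a sequence number and append an HMAC tag."""
    header = struct.pack(">Q", sequence)
    tag = hmac.new(key, header + payload, hashlib.sha256).digest()
    return header + payload + tag


class ReplayGuard:
    """Server side: rejects forged packets and already-seen sequence numbers."""

    def __init__(self, key):
        self.key = key
        self.last_sequence = -1

    def open(self, packet):
        if len(packet) < SEQ_SIZE + TAG_SIZE:
            raise ValueError("packet too short")
        header = packet[:SEQ_SIZE]
        payload = packet[SEQ_SIZE:-TAG_SIZE]
        tag = packet[-TAG_SIZE:]
        expected = hmac.new(self.key, header + payload, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("tag mismatch: forged or corrupted packet")
        (sequence,) = struct.unpack(">Q", header)
        if sequence <= self.last_sequence:
            raise ValueError("replayed packet")
        self.last_sequence = sequence
        return payload
```

This only addresses packets captured and resent on the wire. As the observation above says, a manipulated client can still produce valid packets, which is why client hardening needs its own attention.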

Lesson #4: Use Multicasting with Limited Scope

✓ Case: Broadcasting hinders scalability in both CPU and network bandwidth. Refactored multiple times toward visibility-based multicasting.
✓ Observation
  • Along with memory copying, the broadcast loop is a key reason for high CPU usage.
  • Such broadcasting triggers packet copies, too, because different players use different encryption seeds.
✓ Suggestion
  • Avoid broadcasting.
  • Manage the player list in a way that makes multicasting easy.
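One way to keep a player list that makes visibility-based multicasting easy is a uniform grid: each player is filed under a cell, and an update goes only to players in the sender's cell and its eight neighbors. This is a sketch under assumptions; the `VisibilityGrid` class, the cell size, and the 3x3 neighborhood are illustrative choices, not the structure used in the reference games.

```python
from collections import defaultdict

CELL_SIZE = 100.0  # assumed visibility granularity in world units


class VisibilityGrid:
    """Tracks which players are in which grid cell."""

    def __init__(self, cell_size=CELL_SIZE):
        self.cell_size = cell_size
        self.cells = defaultdict(set)  # (cx, cy) -> set of player ids
        self.cell_of = {}              # player id -> (cx, cy)

    def _cell(self, x, y):
        return (int(x // self.cell_size), int(y // self.cell_size))

    def move(self, player_id, x, y):
        """Update the player's cell membership after a position change."""
        new_cell = self._cell(x, y)
        old_cell = self.cell_of.get(player_id)
        if old_cell == new_cell:
            return
        if old_cell is not None:
            self.cells[old_cell].discard(player_id)
        self.cells[new_cell].add(player_id)
        self.cell_of[player_id] = new_cell

    def recipients(self, player_id):
        """Players in the sender's cell and the eight surrounding cells."""
        cx, cy = self.cell_of[player_id]
        nearby = set()
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nearby |= self.cells.get((cx + dx, cy + dy), set())
        nearby.discard(player_id)
        return nearby
```

The send loop then iterates over `recipients(sender)` rather than over every connected player, so its cost follows local player density instead of total CCU.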

Lesson #5: Don’t Use DB as Synchronization Point

✓ Case: Relied heavily on DB transactions for inter-server synchronization. The DB gets overloaded, and servers get serialized while waiting for DB I/O.
✓ Observation
  • Many programmers overuse DB transactions for synchronization.
  • A DB is heavyweight because it guarantees properties like ACID, so using it for synchronization comes at an extremely high cost.
✓ Suggestion
  • Use inter-server RPC or a memory cache for synchronization.
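The RPC suggestion can be sketched as a single owner server that holds the contested state in memory and serializes requests to it, so no DB transaction is needed to decide who wins. Everything here is hypothetical: the `PartyOwner` class, the party-join scenario, and the use of threads to stand in for RPC calls arriving from other servers.

```python
import threading


class PartyOwner:
    """Hypothetical owner server: the single authority for one party's state."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.members = set()
        self._lock = threading.Lock()

    def handle_join(self, player_id):
        """RPC handler (transport omitted): decide membership in memory."""
        with self._lock:
            if player_id in self.members:
                return False
            if len(self.members) >= self.capacity:
                return False
            self.members.add(player_id)
            return True


def simulate_join_requests(owner, player_ids):
    """Stand-in for other game servers calling the owner concurrently."""
    results = {}

    def call(player_id):
        results[player_id] = owner.handle_join(player_id)

    threads = [threading.Thread(target=call, args=(pid,)) for pid in player_ids]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

The final membership can still be written to the DB afterwards for persistence. The point of the sketch is that the decision itself is made in memory on the owner, not by competing DB transactions.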

Lesson #6: Caching is Not For Free

✓ Case: Intensive use of Redis for data sharing among the servers. Redis quickly becomes a bottleneck.
✓ Observation
  • A cache like Redis is certainly much lighter than a DB.
  • But it also runs jobs to maintain persistency/consistency/availability.
  • Direct server-to-server RPC may be a better solution in some cases.
✓ Suggestion
  • Mind the persistency/consistency/availability settings of the cache program.
  • Consider server-to-server RPC unless caching is unavoidable.
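For the server-to-server option, one possible arrangement is to give every piece of shared state exactly one owning server and to locate that owner by hashing the key, so a lookup goes straight to the owner instead of through a shared cache. The sketch below simulates this in one process; the `GameServer` and `Cluster` classes, the `rpc_get`/`rpc_set` names, and the CRC32-modulo placement are assumptions, with the real RPC transport left out.

```python
import zlib


class GameServer:
    """Hypothetical server that keeps the state it owns in local memory."""

    def __init__(self, name):
        self.name = name
        self.local_state = {}

    def rpc_set(self, key, value):
        self.local_state[key] = value

    def rpc_get(self, key):
        return self.local_state.get(key)


class Cluster:
    """Routes each key to its owning server (RPC transport omitted)."""

    def __init__(self, names):
        if not names:
            raise ValueError("at least one server is required")
        self.servers = [GameServer(name) for name in names]

    def owner_of(self, key):
        """Pick the owner deterministically from the key's CRC32."""
        index = zlib.crc32(key.encode("utf-8")) % len(self.servers)
        return self.servers[index]

    def set(self, key, value):
        self.owner_of(key).rpc_set(key, value)

    def get(self, key):
        return self.owner_of(key).rpc_get(key)
```

Modulo placement reassigns most keys when the server list changes, so a real deployment would need a more stable placement scheme or a migration step. The sketch only shows that the routing needs no shared store.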

Lesson #7: Start with Tools for Server Visibility

✓ Case: Opened a service without any monitoring or operation tools.
✓ Observation
  • The day a service opens is the most hectic.
  • The service is also at its buggiest on that day.
  • Tools for understanding what’s happening inside the server are a must-have.
  • Operation tools are mandatory unless you don’t want to sleep.
✓ Suggestion
  • Spend enough time developing tools.
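A first visibility tool can be as small as a set of in-process counters that an operator can dump at any time. The sketch below is one such starting point; the `ServerStats` class and the counter names used with it are illustrative assumptions, not part of any existing framework.

```python
import json
import threading
import time


class ServerStats:
    """Thread-safe counters that can be dumped as JSON for operators."""

    def __init__(self):
        self._lock = threading.Lock()
        self._counters = {}
        self._started = time.monotonic()

    def incr(self, name, amount=1):
        """Increase a named counter, creating it on first use."""
        with self._lock:
            self._counters[name] = self._counters.get(name, 0) + amount

    def snapshot(self):
        """Copy the counters and attach the server's uptime."""
        with self._lock:
            data = dict(self._counters)
        data["uptime_seconds"] = round(time.monotonic() - self._started, 3)
        return data

    def to_json(self):
        """Serialize the snapshot, e.g. for a log line or an admin endpoint."""
        return json.dumps(self.snapshot(), sort_keys=True)
```

Calling `incr("packets_in")` in the receive path and `incr("db_errors")` in error handlers, then logging `to_json()` periodically, already answers many launch-day questions about what the server is doing.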

Recap

✓ Understand your traffic pattern and try to minimize it.
✓ Avoid packet copies inside the server.
✓ Focus on client obfuscation instead of complex network encryption.
✓ Prefer multicasting to broadcasting. In particular, consider visibility.
✓ Avoid using the DB as a synchronization point. It will collapse.
✓ Caching is not for free. Do not rely on it too much.
✓ Develop proper tools for server visibility before opening a service.

Survey Result for Fun

✓ What OS for game server?
[Table: Choice, Count, Rate; NA: No answer]

✓ What language for game server?
[Table: Choice, Count, Rate]

THANKS!

DK Moon
dkmoon@ifunfactory.com
www.ifunfactory.com

Great Technology For Great Games, iFunFactory