Database Tech Note: July 2016

EXADATA X5-2 Performance Test Memory Issue

Symptom

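As soon as the performance test load starts, used memory and cache grow rapidly. Once free memory is exhausted, the server hangs and the test cannot continue.

Until free memory is exhausted, the server handles traffic normally.

Notably, used memory on node 1 is far higher than on node 2 (Node1: 128G, Node2: 12G).

The SGA is set to 85G on both nodes.

ASMM is not used (sga_max_size=85G).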

Cause

The hugepage setting on node 1 was very large, so used memory on node 1 stayed high even at baseline.

On node 2 the hugepage setting had been omitted by mistake, so used memory stayed low.

[EXA]root@dbnode1:/root# grep Huge /proc/meminfo

HugePages_Total: 60000

HugePages_Free: 58982

HugePages_Rsvd: 0

HugePages_Surp: 0

Hugepagesize: 2048 kB

[EXA]root@dbnode2:/root# grep Huge /proc/meminfo

HugePages_Total: 1024

HugePages_Free: 6

HugePages_Rsvd: 0

HugePages_Surp: 0

Hugepagesize: 2048 kB

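The amount of memory configured for hugepages through the kernel parameter is counted as used memory from the start, whether or not anything actually uses it.

In other words, on node 1 HugePages_Total x Hugepagesize = 60000 x 2048 kB = about 117G is always reported as used.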
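
The arithmetic above can be checked with a short calculation. This is a minimal sketch: the function name `hugepage_pool_gib` is hypothetical, and the inputs are the HugePages_Total and Hugepagesize values from the /proc/meminfo output of the two nodes.

```python
def hugepage_pool_gib(total_pages: int, page_size_kb: int = 2048) -> float:
    """Memory pinned by the hugepage pool, in GiB (kB -> GiB is / 1024**2)."""
    return total_pages * page_size_kb / 1024 ** 2


# Node 1: HugePages_Total = 60000, Hugepagesize = 2048 kB
node1_gib = hugepage_pool_gib(60000)
# Node 2: HugePages_Total = 1024, Hugepagesize = 2048 kB
node2_gib = hugepage_pool_gib(1024)

print(f"node1 pool: {node1_gib:.1f} GiB")
print(f"node2 pool: {node2_gib:.1f} GiB")
```

The pool on node 1 comes to roughly 117 GiB against about 2 GiB on node 2, which is consistent with the large gap in used memory between the two nodes.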

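The value 60000 had been carried over from earlier 11g Exadata builds, where an error occurred if the hugepage pool was smaller than the SGA.

The use_large_pages DB parameter determines whether the SGA is allocated from hugepages. It takes one of three values: TRUE (use hugepages), FALSE (do not use them), or ONLY (use hugepages exclusively).

On the earlier 11g Exadata systems it was set to ONLY, so the DB would not start after the SGA was increased. The hugepage pool had to be enlarged first, and only then could the DB be started.

On this 12c system, however, use_large_pages is FALSE. The large hugepage pool reserved in advance on node 1 therefore served no purpose, and the SGA was allocated from free memory instead, which exhausted memory.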
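
One way to confirm that the SGA is not backed by hugepages is to compare the committed part of the pool with the SGA size. The sketch below parses the node 1 /proc/meminfo values shown earlier. The helper names are hypothetical, and treating "committed pages below SGA size" as "SGA not in hugepages" is a heuristic rather than an Oracle-documented check.

```python
import re

# Node 1 counters as shown in the note
MEMINFO_NODE1 = """\
HugePages_Total: 60000
HugePages_Free: 58982
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
"""


def parse_hugepages(meminfo_text: str) -> dict:
    """Extract the Huge* counters from /proc/meminfo-style text as integers."""
    fields = {}
    for line in meminfo_text.splitlines():
        match = re.match(r"(Huge\w+):\s+(\d+)", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def hugepages_committed_gib(fields: dict) -> float:
    """Pages in use plus pages reserved (reserved pages still count as free)."""
    pages = (fields["HugePages_Total"] - fields["HugePages_Free"]
             + fields["HugePages_Rsvd"])
    return pages * fields["Hugepagesize"] / 1024 ** 2


SGA_GIB = 85  # SGA size stated in the note
fields = parse_hugepages(MEMINFO_NODE1)
committed = hugepages_committed_gib(fields)
sga_in_hugepages = committed >= SGA_GIB

print(f"committed hugepages: {committed:.1f} GiB")
print(f"SGA plausibly backed by hugepages: {sga_in_hugepages}")
```

With only about 2 GiB of the pool committed, an 85G SGA cannot be residing in hugepages, which matches use_large_pages=FALSE.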

SQL> show parameter use_large_pages

NAME TYPE VALUE

------------------------------------ ----------- ------------------------------

use_large_pages string FALSE

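On earlier 11g Exadata deployments, the initial setup included a step that changed the parameter to use_large_pages=ONLY.

That caused confusion when customers increased the SGA and the DB failed to start, so the step that applies use_large_pages=ONLY has reportedly been dropped from recent deployments.

In other words, the difference comes from a change in Oracle's installation procedure, not from the DB version. Even on 12c, the best practice for a large SGA is to set use_large_pages=ONLY and use hugepages.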

Solution

Set the same large hugepage pool on both DB nodes and apply use_large_pages=ONLY.

[EXA]root@dbnode1:/root# sysctl -w vm.nr_hugepages=51200

[EXA]root@dbnode1:/root# echo "vm.nr_hugepages=51200" >> /etc/sysctl.conf

[EXA]root@dbnode1:/root# cat /etc/sysctl.conf | grep nr_huge

# vm.nr_hugepages = 1024

vm.nr_hugepages=51200

[EXA]root@dbnode1:/root# grep Huge /proc/meminfo

HugePages_Total: 51200

HugePages_Free: 1669

HugePages_Rsvd: 119

HugePages_Surp: 0

Hugepagesize: 2048 kB

SQL> alter system set use_large_pages=ONLY scope=spfile sid='*';

SQL> show parameter use_large_pages

NAME TYPE VALUE

------------------------------------ ----------- ------------------------------

use_large_pages string ONLY

After the change, the SGA is served from the hugepage pool that is already counted as used, so memory usage barely changes when the performance test load is applied.
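
As a rough sizing check for the vm.nr_hugepages value used in this section, the number of 2048 kB pages needed to cover the SGA can be computed directly. This is a sketch under the assumption that a single 85G SGA is the only consumer of the pool on each node. The function name `hugepages_needed` is hypothetical.

```python
import math


def hugepages_needed(sga_gib: float, page_size_kb: int = 2048) -> int:
    """Smallest number of hugepages that covers an SGA of sga_gib GiB."""
    return math.ceil(sga_gib * 1024 ** 2 / page_size_kb)


PAGE_SIZE_KB = 2048
required = hugepages_needed(85, PAGE_SIZE_KB)  # 85G SGA from the note
configured = 51200                             # vm.nr_hugepages applied here
pool_covers_sga = configured >= required
headroom_gib = (configured - required) * PAGE_SIZE_KB / 1024 ** 2

print(f"required pages: {required}")
print(f"pool covers SGA: {pool_covers_sga}, headroom: {headroom_gib:.1f} GiB")
```

With use_large_pages=ONLY the instance will not start if the pool is smaller than the SGA, so a check of this kind is worth repeating whenever the SGA size changes.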

Conclusion

When sizing the SGA, also check whether hugepages are used and verify the related parameters.

* DB Parameter : use_large_pages

* Kernel Parameter : vm.nr_hugepages

EXADATA X5-2 Performance Test Network Issue

Symptom

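Under the performance test load, node 1 processed only about half of the expected SQL traffic.

When the same traffic was run against node 2 alone, it was processed at the normal level.

Network traffic on node 1 was also markedly lower than on node 2.

Comparing the AWR reports of both nodes, network-related wait events were high on node 1 only.

This pointed to a problem specific to node 1.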

[Figure: DB Node1 QPS and network monitoring]

[Figure: DB Node2 QPS and network monitoring]

[Figure: AWR Report]

Cause

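The NIC on node 1 was running at 100Mb/s (node 2 was at 1000Mb/s).

The link speed is decided by auto-negotiation with the public switch, and after several server reboots the link had come up at 100Mb/s.

When both the NIC and the switch support 1000Mb/s, the link normally negotiates 1000Mb/s, so the exact reason it settled on 100Mb/s is unknown. (According to the network infrastructure team, this happens occasionally.)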

[EXA]root@dbnode1:/root# ethtool eth1

Settings for eth1:

Supported ports: [ TP ]

Supported link modes: 100baseT/Full

1000baseT/Full

10000baseT/Full

Supported pause frame use: No

Supports auto-negotiation: Yes

Advertised link modes: 100baseT/Full

1000baseT/Full

10000baseT/Full

Advertised pause frame use: No

Advertised auto-negotiation: Yes

Speed: 100Mb/s

Duplex: Full

Port: Twisted Pair

PHYAD: 0

Transceiver: external

Auto-negotiation: on

MDI-X: Unknown

Supports Wake-on: d

Wake-on: d

Current message level: 0x00000007 (7)

drv probe link

Link detected: yes

[EXA]root@dbnode1:/root#

[EXA]root@dbnode1:/root# ethtool eth2

Settings for eth2:

Supported ports: [ TP ]

Supported link modes: 100baseT/Full

1000baseT/Full

10000baseT/Full

Supported pause frame use: No

Supports auto-negotiation: Yes

Advertised link modes: 100baseT/Full

1000baseT/Full

10000baseT/Full

Advertised pause frame use: No

Advertised auto-negotiation: Yes

Speed: 100Mb/s

Duplex: Full

Port: Twisted Pair

PHYAD: 0

Transceiver: external

Auto-negotiation: on

MDI-X: Unknown

Supports Wake-on: d

Wake-on: d

Current message level: 0x00000007 (7)

drv probe link

Link detected: yes
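
A link that negotiated below the expected speed can be spotted mechanically from this kind of output. The sketch below parses an abbreviated copy of the eth1 output above. The function names are hypothetical, and the expected speed of 1000Mb/s is taken from the healthy node 2 link described in this note.

```python
import re

# Abbreviated copy of the `ethtool eth1` output on node 1
ETHTOOL_ETH1 = """\
Settings for eth1:
Advertised link modes: 100baseT/Full
1000baseT/Full
10000baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: Yes
Speed: 100Mb/s
Duplex: Full
Auto-negotiation: on
Link detected: yes
"""


def negotiated_speed_mbps(ethtool_text: str) -> int:
    """Return the value of the 'Speed:' line in Mb/s."""
    match = re.search(r"^\s*Speed:\s*(\d+)Mb/s", ethtool_text, re.MULTILINE)
    if match is None:
        raise ValueError("no numeric Speed line found")
    return int(match.group(1))


def advertised_speeds_mbps(ethtool_text: str) -> list:
    """Return the speeds listed under 'Advertised link modes:', ascending."""
    section = re.search(
        r"Advertised link modes:(.*?)Advertised pause frame use:",
        ethtool_text,
        re.DOTALL,
    )
    if section is None:
        return []
    return sorted(int(s) for s in re.findall(r"(\d+)baseT", section.group(1)))


EXPECTED_MBPS = 1000  # assumption: speed of the healthy link on node 2
speed = negotiated_speed_mbps(ETHTOOL_ETH1)
advertised = advertised_speeds_mbps(ETHTOOL_ETH1)
degraded = speed < EXPECTED_MBPS

print(f"speed={speed} advertised={advertised} degraded={degraded}")
```

Running a check like this on every node after a reboot would surface a 100Mb/s link before a performance test does.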

Solution

The speed was set to 1000Mb/s with ethtool, the performance test was rerun, and QPS returned to the normal level.

eth1 and eth2 are bonded for redundancy, so the setting was applied to both.

[EXA]root@dbnode1:/root# ethtool -s eth1 speed 1000 duplex full autoneg on

[EXA]root@dbnode1:/root# ethtool -s eth2 speed 1000 duplex full autoneg on

[EXA]root@dbnode1:/root#

... applied after about one minute ...

[EXA]root@dbnode1:/root# ethtool eth1

Settings for eth1:

Supported ports: [ TP ]

Supported link modes: 100baseT/Full

1000baseT/Full

10000baseT/Full

Supported pause frame use: No

Supports auto-negotiation: Yes

Advertised link modes: 1000baseT/Full

Advertised pause frame use: No

Advertised auto-negotiation: Yes

Speed: 1000Mb/s

Duplex: Full

Port: Twisted Pair

PHYAD: 0

Transceiver: external

Auto-negotiation: on

MDI-X: Unknown

Supports Wake-on: d

Wake-on: d

Current message level: 0x00000007 (7)

drv probe link

Link detected: yes

[EXA]root@dbnode1:/root#

[EXA]root@dbnode1:/root# ethtool eth2

Settings for eth2:

Supported ports: [ TP ]

Supported link modes: 100baseT/Full

1000baseT/Full

10000baseT/Full

Supported pause frame use: No

Supports auto-negotiation: Yes

Advertised link modes: 1000baseT/Full

Advertised pause frame use: No

Advertised auto-negotiation: Yes

Speed: 1000Mb/s

Duplex: Full

Port: Twisted Pair

PHYAD: 0

Transceiver: external

Auto-negotiation: on

MDI-X: Unknown

Supports Wake-on: d

Wake-on: d

Current message level: 0x00000007 (7)

drv probe link

Link detected: yes

To avoid this risk, we tried to force 1000Mb/s with auto-negotiation disabled (autoneg off), but this could not be forced on the Exadata DB server.

[EXA]root@dbnode1:/root# ethtool -s eth2 speed 1000 duplex full autoneg off

Cannot set new settings: Invalid argument

not setting speed

not setting duplex

not setting autoneg

Oracle Linux 6: Network Limitation of a 1GB Card over Auto Negotiation (Doc ID 1904433.1)

"According to IEEE 802.3-2002 specification, you can not disable auto negotiation when using a 1000baseT NIC"

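As a precaution, the speed was forced with autoneg off on the switch side only, and the following ETHTOOL_OPTS setting was added on both DB servers so that it is applied correctly after a reboot.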

[EXA]root@dbnode1:/root# cat /etc/sysconfig/network-scripts/ifcfg-eth1 | grep ETHTOOL_OPTS

ETHTOOL_OPTS="speed 1000 duplex full autoneg on"

[EXA]root@dbnode1:/root# cat /etc/sysconfig/network-scripts/ifcfg-eth2 | grep ETHTOOL_OPTS

ETHTOOL_OPTS="speed 1000 duplex full autoneg on"

[EXA]root@dbnode2:/root# cat /etc/sysconfig/network-scripts/ifcfg-eth1 | grep ETHTOOL_OPTS

ETHTOOL_OPTS="speed 1000 duplex full autoneg on"

[EXA]root@dbnode2:/root# cat /etc/sysconfig/network-scripts/ifcfg-eth2 | grep ETHTOOL_OPTS

ETHTOOL_OPTS="speed 1000 duplex full autoneg on"
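
Because the same setting has to be present on four interfaces, a small consistency check can help catch a missed file. This is a sketch only: the dictionary below copies the four ETHTOOL_OPTS lines shown above, and the helper name `parse_ethtool_opts` is hypothetical.

```python
# The four ETHTOOL_OPTS lines shown above, keyed by (node, interface)
OPTS_LINE = 'ETHTOOL_OPTS="speed 1000 duplex full autoneg on"'
IFCFG_LINES = {
    ("dbnode1", "eth1"): OPTS_LINE,
    ("dbnode1", "eth2"): OPTS_LINE,
    ("dbnode2", "eth1"): OPTS_LINE,
    ("dbnode2", "eth2"): OPTS_LINE,
}


def parse_ethtool_opts(line: str) -> dict:
    """Split an ETHTOOL_OPTS="k v k v ..." line into a {key: value} dict."""
    name, _, value = line.partition("=")
    if name.strip() != "ETHTOOL_OPTS":
        raise ValueError(f"not an ETHTOOL_OPTS line: {line!r}")
    words = value.strip().strip('"').split()
    return dict(zip(words[0::2], words[1::2]))


expected = {"speed": "1000", "duplex": "full", "autoneg": "on"}
parsed = {nic: parse_ethtool_opts(line) for nic, line in IFCFG_LINES.items()}
mismatched = [nic for nic, opts in parsed.items() if opts != expected]

print("mismatched interfaces:", mismatched)
```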

Conclusion

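Independently of Exadata itself, the network speed can be determined by the configuration of the external switch.

Exadata is an expensive and excellent DB appliance, but it still needs to be verified carefully with performance tests.

If SQL throughput differs significantly from what was captured in production, investigate carefully and find the cause.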

EXADATA X5-2 Performance Test DB Issue - 12C Adaptive Optimization

Symptom

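While node 1 was receiving traffic, library cache lock wait events started to occur heavily after a certain point.

At 13:57, a forced VIP failover redirected all traffic to node 1.

Everything was normal through the failover, but about ten minutes later, from 14:08, the number of active sessions waiting on library cache lock on node 1 surged to roughly 100 and SQL throughput dropped.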

/* Active Session */
SQL> select inst_id, username, program, module, event, sql_id
     from gv$session where status='ACTIVE'
     order by 1,2,4;

   INST_ID USERNAME   PROGRAM                               MODULE                         EVENT                               SQL_ID
---------- ---------- ------------------------------------- ------------------------------ ----------------------------------- -------------
         1 P_USR      wrc@oratest (TNS V1-V3)               JDBC Thin Client               library cache lock                  bttd0gttbjx9w
         1 P_USR      wrc@oratest (TNS V1-V3)               JDBC Thin Client               library cache: mutex X              bttd0gttbjx9w
         1 P_USR      wrc@oratest (TNS V1-V3)               JDBC Thin Client               library cache lock                  bttd0gttbjx9w
         1 P_USR      wrc@oratest (TNS V1-V3)               JDBC Thin Client               library cache lock                  bttd0gttbjx9w
         1 P_USR      wrc@oratest (TNS V1-V3)               JDBC Thin Client               library cache lock                  bttd0gttbjx9w
         1 P_USR      wrc@oratest (TNS V1-V3)               JDBC Thin Client               library cache lock                  bttd0gttbjx9w
         1 P_USR      wrc@oratest (TNS V1-V3)               JDBC Thin Client               library cache lock                  bttd0gttbjx9w
....
         1 P_USR      wrc@oratest (TNS V1-V3)               JDBC Thin Client               library cache lock                  bttd0gttbjx9w
         1 P_USR      wrc@oratest (TNS V1-V3)               JDBC Thin Client               library cache lock                  bttd0gttbjx9w
         1 P_USR      wrc@oratest (TNS V1-V3)               JDBC Thin Client               library cache: mutex X              bttd0gttbjx9w
         1 P_USR      wrc@oratest (TNS V1-V3)               JDBC Thin Client               library cache lock                  bttd0gttbjx9w
         1 P_USR      wrc@oratest (TNS V1-V3)               JDBC Thin Client               library cache lock                  bttd0gttbjx9w
         1 P_USR      wrc@oratest (TNS V1-V3)               JDBC Thin Client               library cache lock                  bttd0gttbjx9w
         1 P_USR      wrc@oratest (TNS V1-V3)               JDBC Thin Client               library cache lock                  bttd0gttbjx9w
         1 P_USR      wrc@oratest (TNS V1-V3)               JDBC Thin Client               library cache: mutex X              bttd0gttbjx9w
         1 P_USR      wrc@oratest (TNS V1-V3)               JDBC Thin Client               library cache lock                  bttd0gttbjx9w
         1 P_USR      wrc@oratest (TNS V1-V3)               JDBC Thin Client               library cache lock                  bttd0gttbjx9w
         1 P_USR      oracle@exaoradb01.melon.com (O000)                                   class slave wait

Cause

A bug related to Adaptive Optimization, a 12c new feature.

My Oracle Support, Bug 19490852 : EXCESSIVE LIBRARY CACHE LOCK

The fix is planned for 12.2. As of now (July 2016) the latest release is 12.1.0.2, for which there is no patch, only a workaround.

SQL> select * from v$version;

BANNER                                                                               CON_ID
-------------------------------------------------------------------------------- ----------
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production              0
PL/SQL Release 12.1.0.2.0 - Production                                                    0
CORE    12.1.0.2.0      Production                                                        0
TNS for Linux: Version 12.1.0.2.0 - Production                                            0
NLSRTL Version 12.1.0.2.0 - Production                                                    0

Solution

Resolved with the workaround of turning the Adaptive Optimization feature off.

SQL> alter system set optimizer_adaptive_features=false scope=both sid='*';
SQL> alter system set optimizer_dynamic_sampling=0 scope=both sid='*';

Normal results were confirmed after the change.

Conclusion

Run performance tests thoroughly, if only to catch the side effects of new features in advance.

When upgrading to 12.2 later, check again for side effects related to Adaptive Optimization.

Monday, July 11, 2016
