We have set up a three-member replica set, all members running MongoDB 3.4, consisting of the following (a trimmed rs.conf() sketch follows the list):
- Primary. Physical local server, Windows Server 2012, 64 GB RAM, 6 cores. Hosted in Scandinavia.
- Secondary. Amazon EC2, Windows Server 2016, m4.2xlarge, 32 GB RAM, 8 vCPUs. Hosted in Germany.
- Arbiter. Tiny cloud-based Linux instance.
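For reference, the members array of rs.conf() looks roughly like this (trimmed; hostnames anonymized, and the arbiter hostname is a placeholder):
"members" : [
    { "_id" : 0, "host" : "primary.XXX.com:27017" },
    { "_id" : 1, "host" : "secondary.XXX.com:27017" },
    { "_id" : 2, "host" : "arbiter.XXX.com:27017", "arbiterOnly" : true }
]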
The problem we are seeing is that the secondary is unable to keep up with the primary. After we seed it with data (copied over from the primary) and add it to the replica set, it typically manages to get in sync, but an hour later it might lag behind by 10 minutes; a few hours later it's an hour behind, and so on, until a day or two later it goes stale.
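For context, we track the lag with the standard shell helpers, run against the primary, roughly like this:
// Per-member "N secs (H hrs) behind the primary" summary.
rs.printSlaveReplicationInfo()
// Member state and last applied optime, as a cross-check.
rs.status().members.forEach(function (m) {
    print(m.name + "  " + m.stateStr + "  optime: " + m.optimeDate);
})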
We are trying to figure out why. The primary consistently sits at 0-1% CPU, while the secondary is under constant heavy load at 20-80% CPU; this seems to be the only potential resource constraint, since disk and network load do not appear to be an issue. There also seems to be some locking going on on the secondary: operations in the mongo shell (such as db.getReplicationInfo()) frequently take 5 minutes or more to complete, and mongostat rarely works against it (it just says i/o timeout). Here is the output from mongostat during a rare instance when it reported stats for the secondary:
host insert query update delete getmore command dirty used flushes vsize res qrw arw net_in net_out conn set repl time
localhost:27017 *0 33 743 *0 0 166|0 1.0% 78.7% 0 27.9G 27.0G 0|0 0|1 2.33m 337k 739 rs PRI Mar 27 14:41:54.578
primary.XXX.com:27017 *0 36 825 *0 0 131|0 1.0% 78.7% 0 27.9G 27.0G 0|0 0|0 1.73m 322k 739 rs PRI Mar 27 14:41:53.614
secondary.XXX.com:27017 *0 *0 *0 *0 0 109|0 4.3% 80.0% 0 8.69G 7.54G 0|0 0|10 6.69k 134k 592 rs SEC Mar 27 14:41:53.673
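(For what it's worth, those stats were captured roughly like the following, run on the primary itself, which is why it shows up both as localhost and under its hostname; the polling interval may have differed:)
mongostat --host localhost:27017 --discover 5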
I ran db.serverStatus() on the secondary and compared it to the primary, and one number that stood out was the following:
"locks" : {"Global" : {"timeAcquiringMicros" : {"r" : NumberLong("21188001783")
That is roughly 21,188 seconds spent waiting to acquire the global read lock, while the secondary's uptime at the time was only about 14,000 seconds.
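For reference, this is roughly how we pulled those two numbers out of the shell on the secondary (field names as they appear in our 3.4 db.serverStatus() output):
var s = db.serverStatus();
// Global read-lock acquisition wait is reported in microseconds; uptime is in seconds.
printjson({ globalLockWaitMicros_r: s.locks.Global.timeAcquiringMicros.r, uptimeSeconds: s.uptime });
// With the wait time exceeding the uptime, more than one thread was, on average,
// queued on the global lock at any given moment.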
Would appreciate any ideas on what this could be, or how to debug this issue! We could upgrade the Amazon instance to something beefier, but we've done that three times already, and at this point we figure that something else must be wrong.
I'll include output from db.currentOp() on the secondary below, in case it helps. (That command took 5 minutes to run, after which the following was logged: Restarting oplog query due to error: CursorNotFound: Cursor not found, cursor id: 15728290121. Last fetched optime (with hash): { ts: Timestamp 1490613628000|756, t: 48 }[-5363878314895774690]. Restarts remaining: 3)
"desc":"conn605", "connectionId":605,"client":"127.0.0.1:61098", "appName":"MongoDB Shell", "secs_running":0, "microsecs_running":NumberLong(16), "op":"command", "ns":"admin.$cmd", "query":{"currentOp":1}, "locks":{}, "waitingForLock":false, "lockStats":{} "desc":"repl writer worker 10", "secs_running":0, "microsecs_running":NumberLong(14046), "op":"none", "ns":"CustomerDB.ed2112ec779f", "locks":{"Global":"W","Database":"W"}, "waitingForLock":false, "lockStats":{"Global":{"acquireCount":{"w":NumberLong(1),"W":NumberLong(1)}},"Database":{"acquireCount":{"W":NumberLong(1)}}} "desc":"ApplyBatchFinalizerForJournal", "op":"none", "ns":"", "locks":{}, "waitingForLock":false, "lockStats":{} "desc":"ReplBatcher", "secs_running":11545, "microsecs_running":NumberLong("11545663961"), "op":"none", "ns":"local.oplog.rs", "locks":{}, "waitingForLock":false, "lockStats":{"Global":{"acquireCount":{"r":NumberLong(2)}},"Database":{"acquireCount":{"r":NumberLong(1)}},"oplog":{"acquireCount":{"r":NumberLong(1)}}} "desc":"rsBackgroundSync", "secs_running":11545, "microsecs_running":NumberLong("11545281690"), "op":"none", "ns":"local.replset.minvalid", "locks":{}, "waitingForLock":false, "lockStats":{"Global":{"acquireCount":{"r":NumberLong(5),"w":NumberLong(1)}},"Database":{"acquireCount":{"r":NumberLong(2),"W":NumberLong(1)}},"Collection":{"acquireCount":{"r":NumberLong(2)}}} "desc":"TTLMonitor", "op":"none", "ns":"", "locks":{"Global":"r"}, "waitingForLock":true, "lockStats":{"Global":{"acquireCount":{"r":NumberLong(35)},"acquireWaitCount":{"r":NumberLong(2)},"timeAcquiringMicros":{"r":NumberLong(341534123)}},"Database":{"acquireCount":{"r":NumberLong(17)}},"Collection":{"acquireCount":{"r":NumberLong(17)}}} "desc":"SyncSourceFeedback", "op":"none", "ns":"", "locks":{}, "waitingForLock":false, "lockStats":{} "desc":"WT RecordStoreThread: local.oplog.rs", "secs_running":1163, "microsecs_running":NumberLong(1163137036), "op":"none", "ns":"local.oplog.rs", "locks":{}, "waitingForLock":false, "lockStats":{"Global":{"acquireCount":{"r":NumberLong(1),"w":NumberLong(1)}},"Database":{"acquireCount":{"w":NumberLong(1)}},"oplog":{"acquireCount":{"w":NumberLong(1)}}} "desc":"rsSync", "secs_running":11545, "microsecs_running":NumberLong("11545663926"), "op":"none", "ns":"local.replset.minvalid", "locks":{"Global":"W"}, "waitingForLock":false, "lockStats":{"Global":{"acquireCount":{"r":NumberLong(272095),"w":NumberLong(298255),"R":NumberLong(1),"W":NumberLong(74564)},"acquireWaitCount":{"W":NumberLong(3293)},"timeAcquiringMicros":{"W":NumberLong(17685)}},"Database":{"acquireCount":{"r":NumberLong(197529),"W":NumberLong(298255)},"acquireWaitCount":{"W":NumberLong(146)},"timeAcquiringMicros":{"W":NumberLong(651947)}},"Collection":{"acquireCount":{"r":NumberLong(2)}}} "desc":"clientcursormon", "secs_running":0, "microsecs_running":NumberLong(15649), "op":"none", "ns":"CustomerDB.b72ac80177ef", "locks":{"Global":"r"}, "waitingForLock":true, "lockStats":{"Global":{"acquireCount":{"r":NumberLong(387)},"acquireWaitCount":{"r":NumberLong(2)},"timeAcquiringMicros":{"r":NumberLong(397538606)}},"Database":{"acquireCount":{"r":NumberLong(193)}},"Collection":{"acquireCount":{"r":NumberLong(193)}}}}],"ok":1}