We have a single master/single streaming replica Postgres 9.3 db on AWS. The load is not terribly high - this is a development/staging environment. (The production shows similar metrics). Point is the "ReplicaLag" shown in Cloudwatch oscillates wildly during the day between 0 and 200 seconds. I've changed max_wal_senders from 5 to 10 with no change.
Any suggestions for diagnosing this?
(This is a t2.small master and t2.small replica, however the production instance is large, and exhibits the same issue. CPU is < 2%, connections < 60, iops only about 5-15 count/second).