Thursday, 9 May 2019

How to handling PDO MySQL fail over in AWS using persistent connections

We have experienced a problem with AWS Aurora failover and looking for pointers as to how to resolve.

Scenario

AWS Aurora set up with two end points:

  • Writer:
    • host: stackName-dbcluster-ID.cluster-ID.us-west-2.rds.amazonaws.com
    • resolves to IP: 10.1.0.X
  • Reader:
    • host: stackName-dbcluster-ID.cluster-ro-ID.us-west-2.rds.amazonaws.com
    • resolves to IP: 10.1.0.Y

So therefore our PDO MySQL Connection string is stackName-dbcluster-ID.cluster-ID.us-west-2.rds.amazonaws.com (for writing)

After failover

On failover, the DNS entries are flipped to point as follows:

  • Reader:
    • host: stackName-dbcluster-ID.cluster-ro-ID.us-west-2.rds.amazonaws.com
    • resolves to IP: 10.1.0.X
  • Writer:
    • host: stackName-dbcluster-ID.cluster-ID.us-west-2.rds.amazonaws.com
    • resolves to IP: 10.1.0.Y

Critically, the PDO Connection string (for writing) remains the same "stackName-dbcluster-ID.cluster-ID.us-west-2.rds.amazonaws.com" b ut points to a different IP address.

What Happened

We had error 1290 "SQLSTATE[HY000]: General error: 1290 The MySQL server is running with the --read-only option so it cannot execute this statement".

As the DB engines are stopped started, our initial persistent connections will have "gone away" and been invalidated (something we immediately handle in a reconnect/retry code).

However the error above means new connections will have been made to the old node, but then not further invalidated with propagation of the DNS change. They lasted 10/15 minutes (well beyond TTL of the DNS).

My Questions

  1. Does anyone know if a persistent connection on PDO is retrieved based on the connection string, or is more reliable using the IP or other signature? Evidence suggests it's hostname, but would like confirmation.
  2. Does anyone know a way to mark a persistent connection as "invalid" in PDO, so that is it not used again?
  3. Or, is there something I missed?

Side notes

We already have code in place to handle the retry, and they retry is told to get a new non-persistent connection (which works). It's at this point we could "invalidate" the PDO connection so the next run of a script does not repeat this cycle over and over.

The failover can happen at any time, so we're not in a position to do manual actions such as restart php (as we had to do this time).

Without persistent connections, performance is notably slower.

FastCGI, Centos 16, PHP 7.2, MySQLD 5.0.12-dev (which is normal on Centos - see https://superuser.com/questions/1433346/php-shows-outdated-mysqlnd-version)



from How to handling PDO MySQL fail over in AWS using persistent connections

No comments:

Post a Comment