Wednesday, April 25, 2018

[google-cloud-sql-discuss] Re: Postgres Connections Randomly Dropping

Since you mention that you see these errors for only 2-3 minutes at a time, a few times a day, I suspect your instance might be going through maintenance updates (which require an instance restart) at those times. Please check the operation logs of your PostgreSQL instance. You can view them in the Cloud Console, on the instance details page under the Operations tab. If you find that the instance was updated at the same time as the errors, that would explain them.
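To make that comparison less error-prone, here is a minimal Python sketch of the timestamp check. It assumes you have copied the operation start/end times from the Operations tab by hand and collected the error times from your application logs, all in the same time zone. The function name, the sample operation entry, and the 5-minute slack are illustrative assumptions, not anything Cloud SQL provides.

```python
from datetime import datetime, timedelta


def errors_near_operations(error_times, operations, slack=timedelta(minutes=5)):
    """Return (error_time, operation_label) pairs for errors that fall
    inside an operation's start/end window, widened by `slack` on each side."""
    matches = []
    for err in error_times:
        for label, start, end in operations:
            if start - slack <= err <= end + slack:
                matches.append((err, label))
                break  # one matching operation per error is enough
    return matches


# Hypothetical sample data: replace with your own timestamps.
errors = [
    datetime(2018, 4, 24, 18, 55, 18),
    datetime(2018, 4, 24, 9, 0, 0),
]
operations = [
    # (label, start, end) as copied from the Operations tab (made-up values)
    ("UPDATE", datetime(2018, 4, 24, 18, 54, 0), datetime(2018, 4, 24, 18, 57, 0)),
]

for err, label in errors_near_operations(errors, operations):
    print(err.isoformat(), "overlaps operation", label)
```

If most of your error bursts line up with an operation, maintenance is the likely cause. If none of them do, the drops are probably coming from somewhere else, for example the proxy pod being restarted.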

If that turns out to be the cause of these errors, I recommend configuring the maintenance window and maintenance timing for the instance, so that updates happen at a time you choose and you avoid surprises in the future.
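If you would rather script this than use the console, the maintenance window is part of the instance settings in the Cloud SQL Admin API. Below is a rough Python sketch that only builds and validates the request body for an instance patch; it does not call the API. The helper name is made up, and I am assuming the `settings.maintenanceWindow` fields `day` (1-7, starting on Monday) and `hour` (0-23, UTC), so please check them against the current API reference before relying on this.

```python
def maintenance_window_patch_body(day, hour):
    """Build a patch body that sets the maintenance window.

    Assumed field semantics (verify against the Cloud SQL Admin API docs):
      day  -- day of week, 1 (Monday) through 7 (Sunday)
      hour -- hour of day in UTC, 0 through 23
    """
    if not 1 <= day <= 7:
        raise ValueError("day must be between 1 (Monday) and 7 (Sunday)")
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23 (UTC)")
    return {"settings": {"maintenanceWindow": {"day": day, "hour": hour}}}


# Example: Sundays at 03:00 UTC, a quiet period for many applications.
body = maintenance_window_patch_body(day=7, hour=3)
print(body)
```

You would then pass a body like this to the instance patch method with whichever client you already use.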

Let me know if this helps.

Regards,


On Wednesday, April 25, 2018 at 12:59:51 PM UTC-4, Nigel Gutzmann wrote:
I have a django and celery application running inside of Google Kubernetes Engine. I am connecting to my CloudSQL instance (postgres) using a Kubernetes service running the CloudSQL Proxy. Database connections and queries generally work fine, but occasionally we get spurts of errors with connections breaking. They are raised in python like this:

OperationalError: could not connect to server: Connection refused Is the server running on host "cloudsql-proxy-service" and accepting TCP/IP connections on port 3306?

or

OperationalError: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request.

I can't find anything that might cause that in the logs of the CloudSQL instance. There are some messages like this in the CloudSQL proxy logs:

2018/04/24 18:55:18 Instance <project_name>:us-central1:<instance_name> closed connection

But I can't necessarily correlate the timestamps between when those messages appear and when we get the python errors. I have tried setting CONN_MAX_AGE and tcp keepalives like this inside django's settings.py:

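import os

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': '<db_name>',
        'USER': os.environ.get('DB_USER'),
        'HOST': os.environ.get('DB_HOST'),
        'PORT': os.environ.get('DB_PORT'),
        'PASSWORD': os.environ.get('DB_PASSWORD'),
        # Seconds to keep a connection open for reuse; 0 closes it after each request.
        'CONN_MAX_AGE': int(os.environ.get('CONN_MAX_AGE', 0)),
        # libpq TCP keepalive settings, passed through to the database driver.
        'OPTIONS': {
            'keepalives': 1,            # enable TCP keepalives
            'keepalives_idle': 480,     # idle seconds before the first probe
            'keepalives_interval': 10,  # seconds between probes
            'keepalives_count': 3,      # failed probes before the connection is dropped
        },
    },
}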

But that didn't seem to make a difference. We still get the same errors in bunches, about 20 errors over the span of 2-3 minutes, 2-3 times per day.
