Wednesday, September 26, 2012

Re: Digest for google-cloud-sql-discuss@googlegroups.com - 1 Message in 1 Topic

Okay, I have appstats data. On the simple poke servlet the delay is all in the OpenConnection call. There's only one series of Open-Exec-Close calls, as expected.

RPC
 @0ms rdbms.OpenConnection real=2267ms api=0ms
 @2267ms rdbms.Exec real=3ms api=0ms
 @2271ms rdbms.CloseConnection real=2ms api=0ms



And here's the fastest poke response in the series, just a few moments earlier:

RPC
 @0ms rdbms.OpenConnection real=25ms api=0ms
 @25ms rdbms.Exec real=4ms api=0ms
 @29ms rdbms.CloseConnection real=2ms api=0ms


I now have the DB poke running every two minutes and the response times are still all over the place (confirmed with appstats). There was no other load on the site during this period:

 (35) 2012-09-26 20:00:00.130 "GET /tasks/poke_db" 200 real=84ms cpu=0ms api=0ms overhead=0ms (3 RPCs)
 (36) 2012-09-26 19:58:00.572 "GET /tasks/poke_db" 200 real=1525ms cpu=0ms api=0ms overhead=0ms (3 RPCs)
 (37) 2012-09-26 19:52:00.329 "GET /tasks/poke_db" 200 real=51ms cpu=0ms api=0ms overhead=0ms (3 RPCs)
 (38) 2012-09-26 19:50:00.283 "GET /tasks/poke_db" 200 real=2016ms cpu=0ms api=0ms overhead=0ms (3 RPCs)
 (39) 2012-09-26 19:48:00.445 "GET /tasks/poke_db" 200 real=1818ms cpu=0ms api=0ms overhead=0ms (3 RPCs)
 (40) 2012-09-26 19:46:00.368 "GET /tasks/poke_db" 200 real=58ms cpu=0ms api=0ms overhead=0ms (3 RPCs)
 (41) 2012-09-26 19:40:00.356 "GET /tasks/poke_db" 200 real=1944ms cpu=0ms api=0ms overhead=0ms (3 RPCs)
 (42) 2012-09-26 19:38:00.284 "GET /tasks/poke_db" 200 real=23ms cpu=0ms api=0ms overhead=0ms (3 RPCs)
 (43) 2012-09-26 19:36:00.906 "GET /tasks/poke_db" 200 real=282ms cpu=0ms api=0ms overhead=0ms (3 RPCs)
 (44) 2012-09-26 19:34:00.161 "GET /tasks/poke_db" 200 real=4729ms cpu=0ms api=0ms overhead=0ms (3 RPCs)
 (45) 2012-09-26 19:32:00.534 "GET /tasks/poke_db" 200 real=2604ms cpu=0ms api=0ms overhead=0ms (3 RPCs)


I've attached a nightmare response time -- 54 seconds --  from one of my more complex pages. That page can load just fine in 500-800ms. Sometimes the delays get into the 12s range. The 30-60s range is less common but not uncommon. 

Appstats helped me track down the source of my orphaned open connections, so at least my code is keeping things tidy with the DB on its end now. But that hasn't seemed to have made any difference to the performance.


Any other ideas or tests to try? 

---------------------------
Keith Mukai, M.Ed.
High School English Teacher
EssayTagger.com Founder & CEO

Volunteer Coach, Niles West HS Boys, Girls Gymnastics Teams



On Mon, Sep 24, 2012 at 9:39 AM, <google-cloud-sql-discuss@googlegroups.com> wrote:

Group: http://groups.google.com/group/google-cloud-sql-discuss/topics

    Keith Mukai <keith.mukai@essaytagger.com> Sep 23 09:05AM -0700  

    Hey Joe,
     
    I'm still using my Spring-Hibernate-CloudSQL framework, but I'm seeing some
    crazily unpredictable Cloud SQL response times. I have a trivial DB "poke"
    servlet that just opens a session, runs "SELECT 1", and reports an elapsed
    time. I have a cron job that hits the poke every five minutes and the
    execution time is all over the place. Here's a series of response times
    with me hitting the poke servlet and refreshing the page over and over
    again (in millis):
     
    67
    117
    59
    37
    317
    2764
    2404
    2080
    35
    28
    1459
    29
    31
    23
    3959
    68
    132
     
    The variability of these results just confounds me. If the code was doing
    something horribly inefficient, the results should be more uniformly
    terrible. But it's obviously capable of running the poke well enough. It's
    the 1000-3000ms responses that are freaking me out. I had zero other load
    during these tests.
     
    On DB-heavy pages, *each* DB hit can scale up to similar 3000+ms times,
    causing the whole page to take an awful 12-15 seconds to load. And then the
    next request a second later might take 250ms.
     
    At first I thought it was just a warmup issue (thus the creation of the
    poke task) -- and it does seem to do better when it's fielding constant
    requests -- but then a long load will happen again at unpredictable times
    (and not related to an /_ah/warmup).
     
    I have some issues on my end -- Spring isn't closing my Hibernate
    connections and I can't figure out why. But as request load increases, I
    tend to see better, more stable performance. Obviously we'd expect the
    opposite if those stale connections were bogging down the system. I'm not
    using a connection pool layer with Hibernate.
     
    I don't know if it's specifically a Cloud SQL issue, but that seems the
    most likely culprit at the moment.
     
    What other info can I provide to help diagnose what's going on?

     

You received this message because you are subscribed to the Google Group google-cloud-sql-discuss.
You can post via email.
To unsubscribe from this group, send an empty message.
For more options, visit this group.


No comments:

Post a Comment