Wednesday, February 7, 2018

[google-cloud-sql-discuss] Partial delete success: expected behavior? workarounds?

Hi,

I'm running into issues with large batch deletes that seem to succeed partially. I have three related tables - A stores child objects of B, B stores child objects of C. Let's call them Pets, People, and Cities - each person tends to have around one pet, but might have several or might have none. Each city has a lot of people.  I'm working on scaling up to larger cities - I'd like to get to millions eventually. Things worked fine up to around 1000 people, but when I got to around 5000 people, I got some weird behavior on deletion.

This is what I do when I delete a city:

My pets refer to my people by foreign key ID, so I have to remove the pets first. My query looks like:

DELETE FROM Pets WHERE person_id IN (SELECT person_id FROM People WHERE city_id=?)

Code looks like:


    PreparedStatement stat = connection.prepareStatement(<above statement>);

    stat.setString(1, city_id);

    return stat.executeUpdate();


Then if that succeeds (returns > 0 is the condition I'm using), I delete all the people (the people refer to city by foreign key ID) with a query like:

DELETE FROM People WHERE city_id=?

and similar code to the pets.

This is where it gets weird. My deletion of pets returns only ~2000 (one time it was 1405, another time 2803) out of the expected 5000 deletions. I don't get an exception or anything. The other ~3000 pets are still in the database - I can go inspect it manually and see them:

MySQL [adam1]> select count(*) from People ppl inner join Pets pts on ppl.person_id=pts.person_id where ppl.city="some city id here";+----------+| count(*) |+----------+| 2197 |+----------+1 row in set (0.16 sec)

Then of course my deletion of people fails because of a foreign key constraint violation. I don't see anything strange about my database at these points - CPU utilization hasn't ever gone over 70%, memory utilization is pretty low, no updates happening simultaneously. I do see tons and tons of messages like "Aborted connection <number> to db: '<db>' user: '<user>' host: 'cloudsqlproxy~<numbers>' (Got an error reading communication packets)", but those happen all the time (many per second, when the db is heavily in use), don't seem to be correlated with this issue.

Is this expected, that a delete partially succeeds? Is there a way for me to detect that it has only partially succeeded besides trying and failing to delete the referenced people?

The project ID is adam-dev-193118, database adam1. The pet, people, and city tables are Runs, Batches, and Projects. Example log is at 2018-02-07 17:43:16.107 EST. Happy to provide additional information.

Any help is appreciated!

-Laura

--
You received this message because you are subscribed to the Google Groups "Google Cloud SQL discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-cloud-sql-discuss+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/google-cloud-sql-discuss/26aa90a0-4b78-4c4d-b850-4d88b8c5258e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

No comments:

Post a Comment