Thursday, May 4, 2017

Re: [google-cloud-sql-discuss] Re: PostgreSQL: to increase shared_buffers

Hey Alexsander,

The question was merely a diagnostic one - to see whether the change would affect the runtime. I find it very odd that enhancing the CPU and RAM simultaneously didn't even change the runtime at all, but that's something to keep in mind as we try to gather more information. Testing what effect the number of regular expressions has could be part of that process. As I said above, I'm no expert on Postgres and maybe the query planner is such that it doesn't matter, but we might as well find out.

Cheers,

Nick
Cloud Platform Community Support

On Thursday, May 4, 2017 at 1:13:37 AM UTC-4, Aleksander Efremov wrote:
Regular expressions will be set by users. Therefore I can't to opportunity to execute any optimizations.
I want to have opportunity to change caches, buffers, etc. I have wrote above.

четверг, 4 мая 2017 г., 2:28:11 UTC+4 пользователь paynen написал:
Hey Alexander,

I'm not much of an expert on Postgres, but maybe you can see if combining some of the regular expressions affects the duration?

Cheers,

Nick
Cloud Platform Community Support

On Wednesday, May 3, 2017 at 5:21:38 AM UTC-4, Aleksander Efremov wrote:
I tried to increase of RAM, size of SSD (read/write rate),  number of CPU, but it didn't bring results. Query is always execute approximately one minute 30 seconds. 
How I can to speed up query?

Here http://pgtune.leopard.in.ua/ I have received settings, but how I can to set they:
```
max_connections = 100 shared_buffers = 2559744MB effective_cache_size = 7679232MB work_mem = 26211778kB maintenance_work_mem = 2GB min_wal_size = 1GB max_wal_size = 2GB checkpoint_completion_target = 0.7 wal_buffers = 16MB default_statistics_target = 100
```

Query:
```
select * from "discovered_urls" where domain_id=5 and is_fetched=false and task_created=false and
url !~ E'^https?:\\/\\/www\\.zeit\\.de\\/[-/a-z0-9]+\\/index$' AND
url !~ E'^https?:\\/\\/www\\.zeit\\.de\\/thema\\/[-/a-z0-9]+$' AND
url !~ E'^https?:\\/\\/www\\.zeit\\.de\\/[-/a-z0-9]+\\/seite-[0-9]$' AND
url !~ E'^https?:\\/\\/www\\.zeit\\.de\\/[-/a-z0-9]+\\/komplettansicht$' and
(url ~ E'^https?:\\/\\/www\\.zeit\\.de\\/[-/a-z0-9]+$') limit 101
```

Plan:
```
Limit  (cost=341.52..42790.50 rows=1 width=132)
  ->  Bitmap Heap Scan on discovered_urls  (cost=341.52..42790.50 rows=1 width=132)
        Recheck Cond: (domain_id = 5)
        Filter: ((NOT is_fetched) AND (NOT task_created) AND (url !~ '^https?:\/\/www\.zeit\.de\/[-/a-z0-9]+\/index$'::text) AND (url !~ '^https?:\/\/www\.zeit\.de\/thema\/[-/a-z0-9]+$'::text) AND (url !~ '^https?:\/\/www\.zeit\.de\/[-/a-z0-9]+\/seite-[0-9]$'::text) AND (url !~ '^https?:\/\/www\.zeit\.de\/[-/a-z0-9]+\/komplettansicht$'::text) AND (url ~ '^https?:\/\/www\.zeit\.de\/[-/a-z0-9]+$'::text))
        ->  Bitmap Index Scan on discovered_urls_domain_id_is_fetched_task_created_index  (cost=0.00..341.52 rows=14487 width=0)
              Index Cond: ((domain_id = 5) AND (is_fetched = false) AND (task_created = false))

```

вторник, 2 мая 2017 г., 20:12:29 UTC+4 пользователь Aleksander Efremov написал:
ok, thanks. I will try to increase of memory. And then I will to see have it sense or not. Maybe I am mistaken and it can't be help.

вторник, 2 мая 2017 г., 19:58:11 UTC+4 пользователь Vladimir Rusinov написал:
Hi Alexander and thanks for feedback!

shared_buffers is a difficult param as it has performance vs stability / recovery time trade-offs and we don't want to compromise those.
We did some testing and found out that setting shared_buffers to around 30% of the instance memory yields best performance for tpc-b/c -like workloads while not compromising other features of the system.
So that's what we set shared_buffers to on sufficiently large instances (with > 3.5G ram as far as I remember).

Note that PostgreSQL will use OS-level disk buffers as well, so if your dataset fits in memory you are likely to see very little read operations regardless of value of shared_buffers.

All that said, we are certainly open for more tuning and we are eager to learn about different workloads. Would you be able to elaborate why do you think increasing shared_buffers benefit your workload?

Thanks!

--
Vladimir Rusinov
PostgreSQL SRE, Google Ireland

Google Ireland Ltd.,Gordon House, Barrow Street, Dublin 4, Ireland
Registered in Dublin, Ireland
Registration Number: 368047

On Mon, May 1, 2017 at 10:10 PM, Aleksander Efremov <mr.e...@gmail.com> wrote:
I execute the select query with the RegExp in condition on 2 million and above records and it spend much time.

вторник, 2 мая 2017 г., 9:05:13 UTC+4 пользователь Aleksander Efremov написал:
When you plan to inject this feature?

--
You received this message because you are subscribed to the Google Groups "Google Cloud SQL discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-cloud-sql-discuss+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/google-cloud-sql-discuss/00e196c7-2710-438e-a0ce-69ddb9632c69%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "Google Cloud SQL discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-cloud-sql-discuss+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/google-cloud-sql-discuss/89f7880b-2ca5-4c24-a7d4-6c0072b24972%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

No comments:

Post a Comment