[MapProxy] Most performant storage option?

Jeff Konnen-2
Hi all,

we host several hundred caches in a file system that resides on an NFS disk, which is mounted on different hosts where MapProxy containers access these files.

It works well, but we are hitting some I/O problems on the NFS server. That's why we are now considering a distributed storage system.

Does anyone have experience with MapProxy on Riak or S3 (MinIO?)?

What would your recommendation be?

We don't have any of these components running today, so given that we would have to create a new infrastructure from scratch, which one would you recommend?

Best regards
Jeff

--
Jeff Konnen

_______________________________________________
MapProxy mailing list
[hidden email]
https://lists.osgeo.org/mailman/listinfo/mapproxy

Re: [MapProxy] Most performant storage option?

Travis Kirstine
We have been using a Riak cache for a few years with good results. Without going into too much detail, here is our basic setup along with some comments:

- we use MapProxy and Riak to store fully seeded imagery tiles
- 6-node Riak cluster with a ring size of 64 (i.e. 64 partitions) and an n_val of 3, meaning that 3 copies of each partition are distributed across the 6 nodes
- HAProxy for load distribution across all nodes
- Riak is configured with the LevelDB backend, with anti-entropy turned on
- since we are using only 64 partitions, each partition can be fairly large. Riak attempts to balance the partitions across the nodes, so in our case 4 nodes have 16% of the partitions each and 2 nodes have 19% each. This can be significant depending on the size of each partition (number of objects / tiles): some nodes may require significantly more storage. To get around this issue you could use a higher number of partitions, resulting in a more equal distribution, but once the cluster is set up it is almost impossible to change this value.
- very easy to add another node; Riak will redistribute the partitions automatically
- Riak stores each tile as a key / object, like CouchDB; MapProxy can use a secondary index to help with queries
- Riak uses the concept of a bucket, a kind of container that holds your keys. MapProxy uses one bucket per cache (1 bucket = 1 cache)
- very fast reads / writes compared to other cache types: we see 100+ tiles / second without pushing it hard (measured while transferring caches from SQLite to Riak)
- deleting / updating your cache can be problematic depending on your use case. You cannot simply delete a bucket/cache; you need to delete each object. If you try to use the MapProxy cleanup it may take forever to complete, as it needs to cycle through each possible key and query Riak. We've had to write Python scripts that use the secondary index to retrieve lists of keys and then delete each key / object in the list.
- Riak logging is terrible, so it can be very difficult to troubleshoot issues. There is no logging of requests / errors for client applications, so once in a while you'll see a timeout and have no idea whether it is a Riak or a network issue
- been very reliable so far
- Basho, the company that developed Riak, went bankrupt a few years ago, but there is still some active development; it was taken over by bet365
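
To make the partition imbalance above concrete, here is a small Python sketch of the arithmetic. The per-node counts (10 and 12 partitions) are not stated in the post; they are inferred from the reported 16% / 19% shares of a 64-partition ring. The `even_split` helper is a hypothetical best case and does not model Riak's actual claim algorithm, which may distribute partitions less evenly.

```python
def share_percent(partitions_owned, ring_size):
    """Percentage of the ring owned by a node holding `partitions_owned` partitions."""
    return 100.0 * partitions_owned / ring_size


def even_split(ring_size, nodes):
    """Idealized best case: spread partitions as evenly as integer counts allow.

    Assumption: this is NOT Riak's claim algorithm, only a lower bound on imbalance.
    """
    base, extra = divmod(ring_size, nodes)
    return [base + 1] * extra + [base] * (nodes - extra)


def imbalance_percent(counts):
    """How many percent more partitions the fullest node owns than the emptiest."""
    return 100.0 * (max(counts) - min(counts)) / min(counts)


# Inferred from the post: 4 nodes at ~16% and 2 nodes at ~19% of 64 partitions.
REPORTED_COUNTS = [10, 10, 10, 10, 12, 12]

if __name__ == "__main__":
    for owned in sorted(set(REPORTED_COUNTS)):
        print(f"{owned} partitions -> {share_percent(owned, 64):.2f}% of the ring")
    print(f"reported imbalance: {imbalance_percent(REPORTED_COUNTS):.1f}%")
    # Ring sizes in Riak are powers of two; compare the idealized split for a few.
    for ring_size in (64, 128, 256, 512):
        counts = even_split(ring_size, 6)
        print(f"ring {ring_size}: best-case imbalance {imbalance_percent(counts):.1f}%")
```

With the inferred counts, the two heaviest nodes own 20% more partitions than the others, and with n_val = 3 that difference applies to every replica they hold. The idealized split shows why a larger ring helps: the unavoidable rounding difference of one partition matters less as the partition count grows.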

In reply to Jeff Konnen's message of Tue, 10 Mar 2020 at 03:53.