Overpass API development



Re: [overpass] compressed database info


  • From: Donal Diamond <donal.diamond@gmail.com>
  • To: overpass@listes.openstreetmap.fr
  • Subject: Re: [overpass] compressed database info
  • Date: Thu, 26 Feb 2015 17:56:59 +0000

Thanks for filling me in.

I applied the patches from the minor_issues branch to my backend_compression branch and built it following http://wiki.osm.org/wiki/Overpass_API/install#Installation

I imported by doing:

init_osm3s.sh ireland-and-northern-ireland.osm.bz2  XXX/db/ XXX/app/osm3s/ --meta

I'm not using attic or augmented_diffs.

I'm seeing some space gain: from about 11 GB down to 9 GB.

new:
96M     nodes.bin
8.0K    nodes.bin.idx
8.0G    nodes.map
204K    nodes.map.idx
68M     nodes_meta.bin
12K     nodes_meta.bin.idx
64M     ways.bin
4.0K    ways.bin.idx
1.2G    ways.map
20K     ways.map.idx
11M     ways_meta.bin
4.0K    ways_meta.bin.idx

old:
267M    nodes.bin
8.0K    nodes.bin.idx
8.0G    nodes.map
208K    nodes.map.idx
533M    nodes_meta.bin
12K     nodes_meta.bin.idx
202M    ways.bin
8.0K    ways.bin.idx
1.3G    ways.map
20K     ways.map.idx
44M     ways_meta.bin
4.0K    ways_meta.bin.idx
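As a quick check, the 11 GB to 9 GB totals quoted above work out to roughly an 18% saving (a one-liner using the round figures from the mail):

```shell
# Relative space saving implied by the quoted totals (11 GB old, 9 GB new).
awk 'BEGIN { printf "%.0f%%\n", (11 - 9) / 11 * 100 }'
```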

Most of my space usage is from the nodes.map file, which hasn't changed. Is it possible or feasible to compress the map files as well?

Donal


On 26 February 2015 at 05:13, Roland Olbricht <roland.olbricht@gmx.de> wrote:
Hi,

I noticed a mention of a compressed database in an earlier thread.

https://github.com/drolbr/Overpass-API/tree/backend_compression

Just looking for some introductory information on this.

How much space does it save? Does it have a performance impact?  How bleeding
edge is the code?

The code is up to date. It has run non-public test updates for some weeks without any problems. Nonetheless, there are other known bugs, so please consider merging the "minor_issues" branch. The "repair_attic_updates" branch only affects you if you use attic data.

The whole world database including attic has a compressed size of 200 GB, as opposed to 500 GB for the uncompressed database.

The performance of the code is not known yet, but the first tests suggest that it is similar to the uncompressed code.

Both the performance and the compression rate can likely be improved by adjusting the values in settings.cc, but I haven't tried yet. If you see mostly processor load, you could replace every 512*1024 with 256*1024 to reduce the amount of data to process. If the performance is limited by disk latency, it may even help to increase that value to 1024*1024. Likewise, doubling the divisor (currently 4, in get_block_size() and get_max_size()) may improve the compression rate but raise latency.

In general these adjustment tests are a lot of work, because each one requires a database rebuild.

Just wondering, as I have an Ireland-only, hourly updated Overpass instance and
I'm getting tight on disk space ;-)

I encourage you to run the compressed version if you can live a day or so without the database, because of the necessary database rebuild.

Best regards,

Roland




