Wednesday, February 22, 2017

Script - MongoDB mongorestore performance


The following script is useful when we want a high degree of parallelism in the full restore of a MongoDB database.

For MongoDB databases without sharding, we can use the following technique to improve the performance of full restores.

It relies on vertical scaling for MongoDB databases without sharding (I know that MongoDB was not designed for vertical scaling, but sometimes you need it). If we deploy a server (AWS or similar) with plenty of CPU threads, memory and IOPS capacity, we can get very low recovery times.

In this case we prevent mongorestore from rebuilding the indexes. Instead, we create them before running mongorestore, so that the server itself updates them as the documents are restored.


  • Index definition extraction (from the same database, or from another one with similar database name, collections and indexes)

echo "db.getCollectionNames().forEach(function(c) { if (c.indexOf('system.') == -1)
{var ind=db.getCollection(c).getIndexes();
print('db.runCommand({ createIndexes:\"'+c+'\",indexes:'+JSON.stringify(ind)+'});');
}
});"|mongo --quiet <server>:<port>/<database> --username=<username> --password=<password> --authenticationDatabase=admin >> indexes.txt
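
Depending on how the mongo shell is invoked, its output can contain banner lines (shell version, connection string) in addition to the generated commands. The following is a minimal sketch for keeping only the generated lines before replaying them. The function name filter_index_commands and the output file indexes.clean.txt are hypothetical, and the sketch assumes that every generated command sits on a single line starting with db.runCommand({ createIndexes:, as the print() call above produces.

```shell
# Hypothetical helper: keep only the createIndexes commands from the
# extraction output, dropping any other line (for example shell banners).
# Assumes one command per line, starting exactly as print() emits it.
filter_index_commands() {
    grep '^db\.runCommand({ createIndexes:' "$1"
}

# Example usage (file names are illustrative):
#   filter_index_commands indexes.txt > indexes.clean.txt
#   wc -l < indexes.clean.txt    # one line per collection
```

The line count of the cleaned file should match the number of non-system collections, which gives a quick sanity check before dropping anything.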


  • Collections drop

echo "db.getCollectionNames().forEach(function(c) { if (c.indexOf('system.') == -1) db[c].drop();});"|mongo <server>:<port>/<database> --username=<username> --password=<password> --authenticationDatabase=admin

  • Index creation (without collections, empty database)

cat indexes.txt|mongo <server>:<port>/<database> --username=<username> --password=<password> --authenticationDatabase=admin

  • mongorestore command without index restore and with a high degree of parallelism

mongorestore -u <username> -p <password> --authenticationDatabase=admin <backup_dir>/<database> -d <database> --numParallelCollections 5 --numInsertionWorkersPerCollection 20 --gzip --noIndexRestore
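
The best values for the two parallelism options depend on the server, so it can help to try several combinations. The following is a minimal sketch of a helper that only prints the command for a given combination, so it can be reviewed before launching a long restore. The function name build_restore_cmd is hypothetical, the defaults of 5 and 20 are simply the values used above, and the -u and -p options are left out on purpose and would need to be added.

```shell
# Hypothetical helper: print (do not run) the mongorestore command.
# Arguments: <database> <backup_dir> [parallel collections] [workers per collection]
# Defaults (5 and 20) mirror the values used in the command above.
build_restore_cmd() {
    printf 'mongorestore --authenticationDatabase=admin %s/%s -d %s --numParallelCollections %s --numInsertionWorkersPerCollection %s --gzip --noIndexRestore\n' \
        "$2" "$1" "$1" "${3:-5}" "${4:-20}"
}

# Example usage (names are illustrative):
#   build_restore_cmd mydb /backups 8 16
```

Printing instead of executing keeps the sketch safe to experiment with. Once a combination looks right, the printed line can be completed with credentials and run as usual.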