Wednesday, February 14, 2018

Procedure - Business dashboards with Pentaho PDI and Geckoboard



In this post we describe a low-cost, fast and efficient alternative to enterprise dashboard solutions.

This alternative is based on the cloud platform Geckoboard.

If we look at the typical dashboard solutions adopted by companies, although there are many more cases and situations, we normally find these:

  • Implementation of complete BA (business analytics) systems that cover every need we may have (ETL, reporting, analytics, dashboards, mining, KPIs, ...). We rarely exploit their full potential, and they far exceed the requirements when the final objective is simply to have a few panels for operations, administration, etc.
  • Ad hoc programmatic solutions, usually dashboard-specific web applications or modules integrated into the main applications. These perfectly meet the objective of having only what interests us (dashboards), but at a fairly high cost.
  • Individual installations of specific products for generating statistics and dashboards, usually low-cost desktop tools. The problem here is usually that this is not a company-wide solution but a tool of limited scope, one that may require specific knowledge or training from its users.
...

In short, many scenarios and possibilities.

At Sepalo we work with and encourage the use of the cloud, both for infrastructure and for specific services.

For a customer who is only interested in business dashboards, we opted to outsource the service to platforms such as Geckoboard, which has many advantages, among which we can highlight:

  • Reduced usage costs (sorry, it isn't open source)
  • Generation of dashboards via web, quickly and easily
  • Web visualization for the whole company and on any device

What about security?
In the implementations we are carrying out, we publish the data to Geckoboard; that is, we do not have to open our systems for Geckoboard to access them, we are the ones who push the information to Geckoboard.

Publishing data to Geckoboard?
Yes, there are several methods. It could be done inside the applications themselves, through the APIs provided by Geckoboard, but we have separated and isolated this publishing into specific ETL processes that are responsible for collecting the information and pushing it to Geckoboard.
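
For illustration, a minimal sketch of what such a publication looks like, assuming the Geckoboard Datasets API with the API key as basic-auth user (the dataset id, fields and values below are hypothetical); in our implementations these HTTP calls are made from inside the ETL process rather than from a shell:

# Sketch: create/update a dataset schema and push rows to it (placeholder dataset and API key)
API_KEY="your_api_key"
curl -s -X PUT "https://api.geckoboard.com/datasets/sales.by_day" \
  -u "$API_KEY:" -H "Content-Type: application/json" \
  -d '{"fields":{"day":{"type":"date","name":"Day"},"orders":{"type":"number","name":"Orders"}}}'

# Replace the dataset contents with the rows collected by the ETL process
curl -s -X PUT "https://api.geckoboard.com/datasets/sales.by_day/data" \
  -u "$API_KEY:" -H "Content-Type: application/json" \
  -d '{"data":[{"day":"2018-02-13","orders":123},{"day":"2018-02-14","orders":141}]}'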

ETLs for publishing web information?
Yes. Although it usually comes to mind that an ETL process only moves data from one database to another, or perhaps uploads/downloads information to databases through files, current data integration systems can do much more. In fact, we often try to develop in our applications complex functionality that is already present in ETL systems, so it is a good idea to know the capabilities of these systems and avoid unnecessary development costs.

OK, speech over: at Sepalo we are successfully using data integration processes based on Pentaho PDI (open source), specifically for publishing data to Geckoboard through its APIs.

Having this process separate and parameterized gives us greater control and versatility: we can point the process at one database or another, quickly publish new datasets to Geckoboard, change refresh intervals, etc.
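
As a sketch of that parameterization (the transformation file name and parameter names below are hypothetical), such a transformation can be launched from the command line with PDI's Pan tool, passing the connection and the dataset to publish as named parameters:

# Sketch: launching the publishing transformation with placeholder parameters
./pan.sh -file=publish_geckoboard.ktr \
  -param:DB_HOST=dbserver01 \
  -param:DB_NAME=sales \
  -param:DATASET_ID=sales.by_day \
  -param:GECKO_API_KEY=your_api_key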

We can see the implementation of these Pentaho PDI processes in the following screenshots:
 





In addition, here is a simple architecture diagram of a dashboard solution with Geckoboard:



In short, a low-cost, fast-to-deploy and efficient solution for business dashboards; basically, what was stated at the beginning of the post ...

How much does it really cost?
We have to include the cost of using the Geckoboard platform itself (prices can be checked in the pricing section of the Geckoboard site), plus the cost of designing and implementing the ETL processes for data publication, which will depend on the information to be published, the number of datasets, how parameterized or metadata-driven we want everything to be, etc. There is also a smaller part to take into account: the final design of the dashboards from the published datasets.

Ideally, for the design and implementation of the ETL processes and for the design of the final dashboards, you should contact a trusted supplier (Sepalo) who can estimate the work and offer a fixed price for these tasks.

Saturday, February 10, 2018

Procedure - Asterisk PBX business automation with API calls

Could we trigger a business process with a telephone call? Yes, we can, with Asterisk.



Here we are going to show the basic installation and configuration steps that allow an incoming phone call, using DTMF (Dual-Tone Multi-Frequency) tones, to execute an API call against our systems and perform some action.

  • Installation

Basic Asterisk installation on a Debian-based OS (e.g. Ubuntu) is done simply by executing:

apt-get update -y
apt-get upgrade -y
apt-get install asterisk -y

  • Voice prompt (locution) generation

Here we'll record, in a typical .wav format, the prompts needed to ask interactively (IVR) for the code_number, such as "Please insert the confirmation code", and a closing message to end the "conversation", such as "Thanks, your request has been processed". These media files are converted to .gsm format by running the sox command on the Asterisk server:

sox loc.code_number.wav -t gsm -r 8000 -c 1 /usr/share/asterisk/sounds/en/loc.code_number.gsm
sox loc.finish.wav -t gsm -r 8000 -c 1 /usr/share/asterisk/sounds/en/loc.finish.gsm

  • Configuration

We will use the old .conf method to configure our call flow, but there are other configuration/scripting methods you can use:

echo -e "[general]
fullname = General
secret = 1234
hassip = yes
context = general_IVR
host = dynamic" > /etc/asterisk/users.conf

echo -e "[general]
bindaddr = 0.0.0.0
host=dynamic
type=friend
allow=all
dtmfmode=info" > /etc/asterisk/sip.conf

echo -e "[general_IVR]
exten => _.,1,Answer()
exten => _.,n,Read(test,loc.code_number,5,skip,2,15)
exten => _.,n,Set(CURL_BODY={\"code_number\":\${test},\"type\":\"phone_general\"})
exten => _.,n,Set(CURL_RESULT=\${CURL(https://postman-echo.com/post,\${CURL_BODY})})
exten => _.,n,Playback(loc.finish,skip)
exten => _.,n,Hangup()" > /etc/asterisk/extensions.conf
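
After writing these files, the configuration can be loaded and checked from the Asterisk CLI; a minimal sketch of that verification (the peer listed by "sip show peers" will be the account your softphone registers with):

# Reload SIP and dialplan configuration without restarting Asterisk
asterisk -rx "sip reload"
asterisk -rx "dialplan reload"

# Check that the SIP peer is registered and that the context is loaded
asterisk -rx "sip show peers"
asterisk -rx "dialplan show general_IVR"

# Watch the console with verbose output while placing a test call
asterisk -rvvv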


  • Phone number

To gain access via a DDI (direct dial-in) telephone number, we'll ask a VoIP provider for a phone number and have it redirected to our SIP configuration (IP, port) with the configured credentials (typically applying TLS and SRTP encryption options). However, you can use the Asterisk system from VoIP softphone software (PhonerLite, Zoiper, ...) without any DDI number associated, so with no extra charges involved.
 
  • Key benefits

No special apps are required to automate things from your mobile phone (you can use any VoIP application that supports SIP servers to connect to your installation).

You can automate internal company processes with simple external phone calls.

You can increase productivity by routing users to destination extensions without human intervention (PBX basic usage, "press 1 for sales, 2 for delivery ...").
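
As a sketch of that basic menu routing (the prompt file and the SIP/sales and SIP/delivery peers below are hypothetical), an additional dialplan context could look like this:

echo -e "[menu_IVR]
exten => s,1,Answer()
exten => s,n,Background(loc.menu)   ; prompt: press 1 for sales, 2 for delivery
exten => s,n,WaitExten(10)
exten => 1,1,Dial(SIP/sales,20)
exten => 2,1,Dial(SIP/delivery,20)
exten => t,1,Hangup()" >> /etc/asterisk/extensions.conf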

You could create full IVR (interactive voice response) systems with voice recognition (which would require additional external voice recognition capabilities to be configured).


Please feel free to contact Sepalo Software if you need more information.


Tuesday, February 6, 2018

Script - MongoDB incremental update preparation

Quick test, please answer:

How do you prepare for a MongoDB incremental update?

    a) No special preparation, because I trust ... (the developers' team, the QA team, cloud backups/snapshots, myself if I prepared the update, ...)
    b) With a previous backup of the whole database
    c) With a previous backup of the items to be changed by the update

Well, if your answer is "a", good luck, you will need it. Even if you have an extremely powerful hardware provider or solution (cloud, on-premise) with multiple recovery options (snapshots, replication, ...) or even working database auditing options (with MongoDB those come with the paid version of the engine), it can be difficult to roll back a specific incremental update, especially if new "good" transactions arrive after the point of the incremental update. Most of those options require additional work and resources to be practical when a rollback is required.

Everyone makes mistakes (even good DBAs ...), so you should be prepared for these situations.
   
The best approaches are options "b" and "c". If you can afford to make a full backup before any incremental update, please do it, but this isn't always possible because of the time and resources a whole database backup consumes.

Another option could be to keep incremental backups, so you could restore an initial snapshot of your database and roll forward until just before the time of the incremental update.
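
A minimal sketch of that roll-forward approach, assuming a base dump plus a separately saved copy of the oplog (the paths and the timestamp below are placeholders): restore the dump, then replay the saved oplog only up to the moment just before the incremental update:

# Sketch: restore the base dump taken before the period we want to replay
mongorestore /backups/full_dump

# Replay a previously dumped oplog (saved as oplog.bson in /backups/oplog_replay),
# stopping just before the timestamp of the incremental update
mongorestore --oplogReplay --oplogLimit 1517900000:0 /backups/oplog_replay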

Apart from those options, you could have a simple script to make a pre-backup of the docs to be changed by the update.

This gives you another way to recover (in addition to the backups mentioned earlier) in case an update rollback is required.

At Sepalo Software, in addition to periodic backups, incremental backups and cloud snapshot options, we use an automated version of the following script, which is integrated in our DevOps cycle (automatic incremental update execution, automatic database versioning, ...):
   
-- Pre-backup

limit=0;          // 0 = no limit, back up every matching doc
skip=0;
sort={};
filt={};          // filter matching the docs the update will change
coll="test";      // collection affected by the update
tag="TICKET-123"; // ticket/RFC identifier used to find the backup later
// Copy every matching doc into the updateBackups collection, tagged for rollback
db.getCollection(coll).find(filt).sort(sort).skip(skip).limit(limit).forEach(function(x){
    db.updateBackups.insert({"id":tag, "collection": coll, "date":  new Date(), "data":x});
});
print('Prepared backups of collection:'+coll+',with filter:'+tojson(filt)+',with identification:'+tag);


-- Rollback

tag="TICKET-123"
db.updateBackups.find({"id":tag}).forEach(function(x){
    db.getCollection(x.collection).remove({"_id":x.data._id},{"multi":false});
    db.getCollection(x.collection).insert(x.data);
});   
print('Rollbacked partial backup with identification:'+tag);


So, for an incremental update like this:

db.users.update({type:"preAuthorized"},{$set:{type:"authorized"}},{multi:true});

Developers should generate an RFC script (ticket id: PRJ-1234) like this:

//Pre-backup
limit=0;
skip=0;
sort={};
filt={type:"preAuthorized"};
coll="users";
tag="PRJ-1234";
db.getCollection(coll).find(filt).sort(sort).skip(skip).limit(limit).forEach(function(x){
    db.updateBackups.insert({"id":tag, "collection": coll, "date":  new Date(), "data":x});
});
print('Prepared backups of collection:'+coll+',with filter:'+tojson(filt)+',with identification:'+tag);
//Update
db.users.update({type:"preAuthorized"},{$set:{type:"authorized"}},{multi:true});


In case the incremental update needs to be rolled back, we'll just need to execute:

tag="PRJ-1234"
db.updateBackups.find({"id":tag}).forEach(function(x){
    db.getCollection(x.collection).remove({"_id":x.data._id},{"multi":false});
    db.getCollection(x.collection).insert(x.data);
});   
print('Rollbacked partial backup with identification:'+tag);

That will restore the old docs affected by the incremental update script.

With this type of script, we have a quick, automated way to provide a "rollback" script for our incremental updates.

Please feel free to contact Sepalo Software if you want more information about DevOps automation, improving the performance of these scripts (bulk options, parallelization) or automatic database versioning.


Friday, February 2, 2018

Procedure - (Node + MongoDB) Application Performance Benchmarking with JMeter + Grafana + collectd

Well, the post subject is very illustrative ...

How do you benchmark your app/database tiers? There are a lot of tools for that; you just have to find the procedures that fit your needs.

Our typical test environment looks like this:



At Sepalo Software we use JMeter as our reference benchmarking tool; most apps can be stressed through API calls that we can easily generate with JMeter. The apps are typically Node apps (cloud based) with MongoDB (cloud based) as the backend database.

Our first goal was to parameterize all the API call scenarios by grouping them and exposing a set of parameters that can be passed to the JMeter executable to modify the benchmark scenario as needed (more update calls vs. insert calls, user ramp-up, test interval, burst intervals, and so on) and even the type of scenario itself (stability, stress, burst).

We met this goal by using JMeter properties (e.g. "${__P(users,1)}") whose values we can specify on the JMeter command line.
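
For illustration, a minimal sketch of such a non-GUI run (the test plan file and property names are hypothetical); each -J option sets a property that the plan reads through ${__P(...)}:

# Sketch: non-GUI JMeter run with placeholder scenario properties
jmeter -n -t api_benchmark.jmx \
  -Jusers=100 \
  -Jrampup=60 \
  -Jduration=1800 \
  -Jinsert_ratio=30 -Jupdate_ratio=70 \
  -l results.jtl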

With these parameters we can execute multiple test scenarios:
  • Stability test scenario: meant to verify whether the application can continuously perform well within, or just above, the acceptable load over a long period. It is often referred to as load or endurance testing.
  • Burst test scenario: a type of performance testing focused on determining an application's robustness, availability, and reliability under peak load conditions. The goal of burst testing is to check whether application performance recovers after the peak conditions.
  • Stress test scenario: a type of performance testing focused on determining an application's robustness, availability, and reliability under extreme conditions. The goal of stress testing is to identify application issues that arise or become apparent only under extreme conditions.

And also, we can have multiple different configurations within a single scenario (distinct percentages of API calls for each type of request).

After automating and parameterizing these JMeter benchmarks, we realized that most of our test time was spent preparing the results: taking the output CSV files and building spreadsheet pivot tables and charts.

So we configured Grafana as our time-series dashboard solution to show the test results, which JMeter saves automatically in an InfluxDB database. For OS statistics and metrics, we use collectd, which automatically saves all server statistics in the same InfluxDB. All these tools are open source (which does not mean free).
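
As a sketch of the metrics side (the host name and config directory below are assumptions, and InfluxDB's collectd listener has to be enabled on its default port 25826), collectd only needs the network plugin pointed at the InfluxDB server:

# Sketch: send OS metrics from each server to InfluxDB's collectd listener
echo -e "LoadPlugin cpu
LoadPlugin memory
LoadPlugin network
<Plugin network>
  Server \"influxdb.internal\" \"25826\"
</Plugin>" > /etc/collectd/collectd.conf.d/influxdb.conf
systemctl restart collectd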

Finally, we have customizable web-based results like these:





With this test environment configuration (cloud based, of course ...) we spend our time on the important phase of the whole process: the initial definition of the API calls that stress the apps.

Please feel free to contact Sepalo Software if you need more information.