Building a recommender service for your apps : Part III of III : Collecting data from Event API

You need to install all the requirements described in Part I. You also need to run the machine learning server, as described in Part II.

You can use this event API to connect to your apps to show the recommendations. All organised!

The events are the specific data collected from your apps, on which the recommendations are modeled. I will write some more advanced posts later on customizing a set of your own algorithms.


//Launching the event server

$ cd $PIO_HOME
$ bin/pio eventserver

//HTTP GET

$ curl -i -X GET http://localhost:7070/events/<your_eventId>.json

// HTTP DELETE

$ curl -i -X DELETE http://localhost:7070/events/<your_eventId>.json
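
If you would rather call the event server from a script than from curl, the same two requests can be sketched in Python. This is only a minimal sketch: it assumes the event server runs on localhost:7070 as in the curl commands above, the helper names (event_url, event_request, send) are my own and not part of any SDK, and it is written for Python 3 rather than the Python 2 used in seed.py.

```python
from urllib.request import Request, urlopen

# Assumed default, taken from the curl examples above
EVENT_SERVER = "http://localhost:7070"


def event_url(event_id, base=EVENT_SERVER):
    """Return the REST URL of a single stored event."""
    return "{}/events/{}.json".format(base.rstrip("/"), event_id)


def event_request(method, event_id, base=EVENT_SERVER):
    """Build (but do not send) a GET or DELETE request for one event."""
    if method not in ("GET", "DELETE"):
        raise ValueError("unsupported method: " + method)
    return Request(event_url(event_id, base), method=method)


def send(request):
    """Send the request; needs a running event server (hypothetical helper)."""
    with urlopen(request) as response:
        return response.status, response.read().decode("utf-8")
```

With the event server running, send(event_request("GET", "<your_eventId>")) would fetch one event and send(event_request("DELETE", "<your_eventId>")) would remove it.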

See you guys later!

Building a recommender service for your apps : Part II of III – Running Machine Learning Server

You need to install some specific requirements, which you can check in Part I of this post. I will be using the Python SDK.

Step 1 : Run your first project


$ mkdir appone

$ cd appone

$ easy_install predictionio

Step 2 : Create seed.py


import predictionio
import random

random.seed()

# app_id below is a placeholder: use the app ID from your API access settings
client = predictionio.EventClient(app_id=getitfromtheapiaccess)

# generate 10 users, with user ids 1,2,....,10
user_ids = [str(i) for i in range(1, 11)]
for user_id in user_ids:
    print("Set user %s" % user_id)
    client.set_user(user_id)

# generate 50 items, with item ids 1,2,....,50
# assign type id 1 to all of them
item_ids = [str(i) for i in range(1, 51)]
for item_id in item_ids:
    print("Set item %s" % item_id)
    client.set_item(item_id, {
        "pio_itypes": ['1']
    })

# each user randomly views 10 items
for user_id in user_ids:
    for viewed_item in random.sample(item_ids, 10):
        print("User %s views item %s" % (user_id, viewed_item))
        client.record_user_action_on_item("view", user_id, viewed_item)

client.close()

Step 3 : Collecting data from PIO


$ $PIO_HOME/bin/pio eventserver

$ python seed.py

Step 4 : Deploy instance for appone


$ $PIO_HOME/bin/pio instance io.prediction.engines.itemrank
$ cd io.prediction.engines.itemrank
$ $PIO_HOME/bin/pio register

//Edit params/datasource.json & modify the value of AppID

Step 5: Train & Deploy


$ $PIO_HOME/bin/pio train

//If training is successful

$ $PIO_HOME/bin/pio deploy

Step 6 : Create results.py


import predictionio

client = predictionio.EngineClient()

# Rank item 1 to 5 for each user
item_ids = [str(i) for i in range(1, 6)]
user_ids = [str(x) for x in range(1, 11)]
for user_id in user_ids:
    print("Rank item 1 to 5 for user %s" % user_id)
    try:
        response = client.send_query({
            "uid": user_id,
            "iids": item_ids
        })
        print(response)
    except predictionio.PredictionIOAPIError as e:
        print("Caught exception: %s" % e.strerror())

client.close()

$ python results.py

Setting Apache Hadoop on Nitrous.IO

Once you have set up any language box (I chose a Node box), proceed to run the following:


$ cd workspace

$ wget http://mirror.nus.edu.sg/apache/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz

$ tar xzf hadoop-1.2.1.tar.gz

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

$ chmod 600 ~/.ssh/authorized_keys

$ vim hadoop-1.2.1/conf/hadoop-env.sh

//Here we set the JVM path. Check your Java installation with $ java -version, otherwise Hadoop gives an error saying JAVA_HOME is not set.

// export JAVA_HOME=/usr/lib/jvm/java-7-oracle

$ cd hadoop-1.2.1

$ bin/hadoop namenode -format

$ bin/hadoop fs -mkdir input

$ bin/hadoop fs -put conf input

$ bin/hadoop fs -cp conf/*.xml input

$ bin/start-all.sh

$ bin/hadoop jar hadoop-examples-1.2.1.jar grep input output 'dfs[a-z.]+'

$ bin/hadoop fs -rmr output

$ bin/hadoop jar hadoop-examples-1.2.1.jar wordcount input output

$ bin/hadoop fs -rmr output

$ bin/stop-all.sh

Now you need to configure core-site.xml


<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/action/tmp</value>
</property>
</configuration>

also hdfs-site.xml


<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>

Lastly mapred-site.xml


<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
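
A typo in any of these three files is easy to miss, so here is a small Python sketch that reads a Hadoop-style configuration file into a dictionary so you can check the values you just set. The function name read_site_properties and the CORE_SITE constant are my own; the only assumption about the files is the configuration/property/name/value layout shown above.

```python
import xml.etree.ElementTree as ET


def read_site_properties(xml_text):
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    properties = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            properties[name.strip()] = (value or "").strip()
    return properties


# Same content as the core-site.xml shown above
CORE_SITE = """<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/action/tmp</value>
</property>
</configuration>"""

core = read_site_properties(CORE_SITE)
print(core["fs.default.name"])
```

To check a real file, read its contents first, for example read_site_properties(open("conf/core-site.xml").read()).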

That's all! Just open http://localhost:50030 in your browser for the Hadoop JobTracker UI.

Building a recommender service for your apps : Part I of III – Introduction and Installation

Lately I have been working on a small project in my free time that focuses on providing recommendations to my other (earlier) app projects. The recommendation engine needs to run a set of modeled algorithms and then learn from the real-time inputs of the user. Since my project is not open source, I can only share how to create a simple machine learning server for your apps (the stack is open source).

STEP 1 : INTRODUCTION – RECOMMENDER SYSTEMS

  • What is a recommender system?
  1. It is a subclass of information filtering systems that seeks to predict the ‘rating’ or ‘preference’ that a user would give to an item.
  2. There are three approaches: collaborative filtering, content-based filtering and hybrid recommender systems.
  3. The accuracy of these systems depends upon the accuracy of the recommendation algorithms.
  4. Data is always the issue. It may raise privacy concerns.
  • Collaborative filtering in a nutshell:
  1. Collect and analyze large amounts of data on “user behavior”.
  2. It can be used to build complex systems in which a user can be recommended a product without the system understanding the product itself.
  3. It works on the rule: if people agreed in the past, they will agree in the future as well.
  4. A user behavior profile is created by making a distinction between explicit and implicit data collected from the user's actions.
  5. The system basically compares the data collected from one user with similar and dissimilar data collected from others.
  6. It suffers from three issues: “Cold start”, i.e. it requires a large amount of data on an existing user to make accurate recommendations. “Scalability”, i.e. a large amount of computation is required. Lastly, “Sparsity”, i.e. users rate very few items, which makes the user-item matrix sparse.
  7. It can be classified into “memory based” and “model based”.
  • Content-based filtering in a nutshell:
  1. It focuses on the description of the item and the profile of the user's preferences.
  2. The system basically works like this: keywords are used to describe the items, a profile of a user is created based on the items the user likes, and then both are matched to produce a recommendation list.
  3. The user input is the quantification of the items the user likes. A specific model is triggered based on user interaction.
  4. This method makes extensive use of information retrieval and information filtering. Items are profiled, for example by using tags.
  5. The content-based profile is created as a “weighted vector of item features”. Each weight denotes the importance of a feature to the user.
  6. A user will not always respond in the same way to other content profiles, although they might respond similarly to similar content-based profiles.
  • Hybrid recommender systems in a nutshell:
  1. Combine the above two approaches to reduce the issues of both. When CF and CBF are combined, issues like cold start and sparsity are reduced to a minimum.
  2. Popular techniques are: weighted, switching, mixed, feature combination, feature augmentation, cascading and meta level.
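
To make the collaborative filtering idea concrete, here is a minimal memory-based sketch in Python. It is not what PredictionIO runs internally; it only illustrates the rule "people who agreed in the past will agree in the future" using cosine similarity between users. The ratings data and the function names (cosine, recommend) are made up for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse rating dicts {item: rating}."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(ratings, user, k=3):
    """Rank items the user has not rated by similarity-weighted ratings of others."""
    scores = {}
    for other, their_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine(ratings[user], their_ratings)
        if similarity <= 0:
            continue  # users with nothing in common contribute nothing
        for item, rating in their_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + similarity * rating
    # highest score first, item id as tie-breaker
    return sorted(scores, key=lambda item: (-scores[item], item))[:k]


# Hypothetical explicit ratings: user -> {item: rating}
ratings = {
    "alice": {"a": 5, "b": 4},
    "bob": {"a": 5, "b": 4, "c": 5},
    "carol": {"d": 5},
}

print(recommend(ratings, "alice"))
```

Alice and Bob rated the same items the same way, so Bob's extra item is suggested to Alice. Carol shares no items with anyone, so she gets no recommendations, which is the cold start and sparsity problem in miniature.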

STEP 2 : INSTALLATION REQUIREMENTS

(1) We will be setting up the server on Windows. You can set up the same on Ubuntu or your AWS instance.

(2) We need to install some specific requirements, i.e.:

  • Apache Hadoop – The distributed data processing framework
  • Apache HBase – The distributed column-oriented database
  • Apache Spark for Hadoop – Large-scale data processing engine on top of Hadoop
  • Elasticsearch – Open source search and analytics engine
  • Java 7 – Since everything runs as a Java process, this is the most important.

STEP 2.1 – Installing Apache Hadoop

  • We will be setting up Apache Hadoop locally on your Win32 machine. This is required only if you are running distributed and complex recommendations for your application.
  • Install ssh on your local machine. You can install PuTTY or OpenSSH. You can check this link.
  • Install Cygwin. This provides Linux-like functionality on your Windows devices.
  • Download a stable Hadoop release here.
  • Unpack it and edit conf/hadoop-env.sh to define JAVA_HOME. To check, run

$ bin/hadoop

  • You can now run in three modes: local (standalone), pseudo-distributed or fully distributed. We will run in standalone mode, as we need to run as a non-distributed single Java process.
  • To see it in action, run this example, which takes the input directory, then finds every match of the given regular expression and writes it to the given output directory.

$ mkdir input
$ cp conf/*.xml input
$ bin/hadoop jar hadoop-examples-*.jar grep input output 'dfs[a-z.]+'
$ cat output/*
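
If you are curious what the grep example computes, here is a rough single-process Python sketch of the same idea: count every match of the regular expression across the input texts and list the most frequent first. It only approximates the Hadoop job for illustration; the function name grep_counts and the sample strings are made up.

```python
import re
from collections import Counter


def grep_counts(texts, pattern):
    """Count every regex match across texts, most frequent first."""
    regex = re.compile(pattern)
    counts = Counter()
    for text in texts:
        counts.update(regex.findall(text))
    # sort by descending count, then alphabetically for a stable order
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))


# Made-up snippets standing in for the XML files in the input directory
sample = ["dfs.replication dfs.name.dir", "dfs.replication"]

for match, count in grep_counts(sample, r"dfs[a-z.]+"):
    print(count, match)
```

To run it over real files, read each file in the input directory and pass the contents as the texts list.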

STEP 2.2  – Setting up Apache Hbase

  • Download from this link.
  • Unpack and run with cygwin. The configuration is for Java, SSH and HBase.

 STEP2.2.1 – Configuring Java

  • Create a link in /usr/local to the Java home directory by using the following command and substituting the name of your chosen Java environment:

ln -s /cygdrive/c/Program\ Files/Java/<jre name> /usr/local/<jre name>

  • Test your Java installation by changing directories to your Java folder with cd /usr/local/<jre name> and issuing the command ./bin/java -version. This should output the version of your chosen JRE.

 STEP2.2.2 – Configuring SSH

  • Configuring SSH is quite elaborate, but primarily a question of launching it by default as a Windows service.
  • On Windows Vista and above, make sure you run the Cygwin shell with elevated privileges by right-clicking on the shortcut and using Run as Administrator.
  • First of all, we have to make sure the rights on some crucial files are correct. Use the commands underneath. You can verify all rights by using the ls -l command on the different files. Also, notice that the auto-completion feature in the shell using <TAB> is extremely handy in these situations.

chmod +r /etc/passwd to make the passwords file readable for all

chmod u+w /etc/passwd to make the passwords file writable for the owner

chmod +r /etc/group to make the groups file readable for all

chmod u+w /etc/group to make the groups file writable for the owner

chmod 755 /var to make the var folder writable to owner and readable and executable to all

  • Edit the /etc/hosts.allow file using your favorite editor (why not VI in the shell!) and make sure the following two lines are in there before the PARANOID line:

ALL : localhost 127.0.0.1/32 : allow

ALL : [::1]/128 : allow

  • Next we have to configure SSH by using the script ssh-host-config

If this script asks to overwrite an existing /etc/ssh_config, answer yes.

If this script asks to overwrite an existing /etc/sshd_config, answer yes.

If this script asks to use privilege separation, answer yes.

If this script asks to install sshd as a service, answer yes. Make sure you started your shell as Administrator!

If this script asks for the CYGWIN value, just press <ENTER> as the default is ntsec.

If this script asks to create the sshd account, answer yes.

If this script asks to use a different user name as service account, answer no as the default will suffice.

If this script asks to create the cyg_server account, answer yes. Enter a password for the account.

  • Start the SSH service using net start sshd or cygrunsrv --start sshd. Notice that cygrunsrv is the utility that makes the process run as a Windows service. Confirm that you see a message stating that the CYGWIN sshd service was started successfully.
  • Harmonize Windows and Cygwin user account by using the commands:

mkpasswd -cl > /etc/passwd

mkgroup --local > /etc/group

  • Test the installation of SSH:
    1. Open a new Cygwin terminal
    2. Use the command whoami to verify your userID
    3. Issue an ssh localhost to connect to the system itself
      1. Answer yes when presented with the server’s fingerprint
      2. Issue your password when prompted
      3. test a few commands in the remote session
      4. The exit command should take you back to your first shell in Cygwin
    4. Exit should terminate the Cygwin shell.

 STEP2.3 – Configuring Hbase

  • If all previous configurations are working properly, we just need some tinkering at the HBase config files to properly resolve on Windows/Cygwin. All files and paths referenced here start from the HBase [installation directory] as working directory.
  • HBase uses the ./conf/hbase-env.sh to configure its dependencies on the runtime environment. Copy and uncomment the following lines just underneath their originals, and change them to fit your environment. They should read something like:

export JAVA_HOME=/usr/local/<jre name>

export HBASE_IDENT_STRING=$HOSTNAME as this most likely does not include spaces.

  • HBase uses the ./conf/hbase-default.xml file for configuration. Some properties do not resolve to existing directories because the JVM runs on Windows. This is the major issue to keep in mind when working with Cygwin: within the shell all paths are *nix-alike, hence relative to the root /. However, every parameter that is to be consumed within the Windows processes themselves needs to be a Windows setting, hence C:\-alike. Change the following properties in the configuration file, adjusting paths where necessary to conform with your own installation:

hbase.rootdir must read e.g. file:///C:/cygwin/root/tmp/hbase/data

hbase.tmp.dir must read C:/cygwin/root/tmp/hbase/tmp

hbase.zookeeper.quorum must read 127.0.0.1 because for some reason localhost doesn't seem to resolve properly on Cygwin.

  • Make sure the configured hbase.rootdir and hbase.tmp.dir directories exist and have the proper rights set up e.g. by issuing a chmod 777 on them.
  • This should conclude the installation and configuration of Apache HBase on Windows using Cygwin. So it’s time to test it.
  1. Start a Cygwin terminal, if you haven’t already.
  2. Change directory to the HBase installation using cd /usr/local/hbase-<version>, preferably using auto-completion.
  3. Start HBase using the command ./bin/start-hbase.sh
    1. When prompted to accept the SSH fingerprint, answer yes.
    2. When prompted, provide your password. Maybe multiple times.
    3. When the command completes, the HBase server should have started.
    4. However, to be absolutely certain, check the logs in the ./logs directory for any exceptions.
  4. Next we start the HBase shell using the command ./bin/hbase shell
  5. We run some simple test commands
    1. Create a simple table using command create 'test', 'data'
    2. Verify the table exists using the command list
    3. Insert data into the table using e.g.
    4. List all rows in the table using the command scan 'test', which should list all the rows previously inserted. Notice how 3 new columns were added without changing the schema!
    5. Finally we get rid of the table by issuing disable 'test' followed by drop 'test' and verified by list which should give an empty listing.
  6. Leave the shell by exit
  7. To stop the HBase server issue the ./bin/stop-hbase.sh command. And wait for it to complete!!! Killing the process might corrupt your data on disk.
  8. In case of problems,
    1. verify the HBase logs in the ./logs directory.
    2. Try to fix the problem
    3. Get help on the forums or IRC (#hbase@freenode.net). People are very active and keen to help out!
    4. Stop, restart and retest the server.

 STEP2.4 – Installing Apache spark

  • The link to follow is here.

$ wget http://d3kbcqa49mib13.cloudfront.net/spark-1.1.0-bin-hadoop2.4.tgz
$ tar zxvf spark-1.1.0-bin-hadoop2.4.tgz

SPARK_HOME=/home/abc/Downloads/spark-1.1.0-bin-hadoop2.4

$ wget https://archive.apache.org/dist/hbase/hbase-0.98.6/hbase-0.98.6-hadoop2-bin.tar.gz
$ tar zxvf hbase-0.98.6-hadoop2-bin.tar.gz
$ cd hbase-0.98.6-hadoop2

//Edit conf/hbase-site.xml

<configuration>
<property>
<name>hbase.rootdir</name>
<value>file:///home/abc/hbase</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/abc/zookeeper</value>
</property>
</configuration>

//Edit conf/hbase-env.sh

export JAVA_HOME=`/usr/libexec/java_home -v 1.7`
$ bin/start-hbase.sh

 STEP2.5 – Installing Elastic-search

  • The link to follow is here.

$ wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.3.2.tar.gz
$ tar zxvf elasticsearch-1.3.2.tar.gz
$ cd elasticsearch-1.3.2
$ bin/elasticsearch

Getting things done ! Lessons learnt !


Most of the people I meet these days are struggling to make choices about themselves and, even more, about the future they will have. I am more of a logical person but I surely know karma is a bitch.

I may be rude with most people around, but the golden rules in life are very straightforward – no one gives anything to anyone, but you can have everything if you work hard towards achieving it.

Keeping to the rules, I believe that everyone should have big, focused but optimistic goals instead of big dreams. I see people brag about their dreams, but life values those with goals and a plan to achieve them. Big as in very ambitious, focused as in clear enough to know whether you want them or not, and lastly optimistic as the journey to achieve them will change from time to time. Plan less and take action is what I have learnt from my various past failures. It’s a long journey and no one can be perfect, but you can achieve the maximum if you have the right agenda in mind.

A few simple things that I have learnt may help anyone achieve their goals:

(1) Plan yourself for the next five years. Make it challenging for yourself to achieve something, then put it on the silver table. Keep the goals aligned to your career, family and friends.

(2) Focus on your weaknesses and try to improve yourself, being a better person every day.

(3) Focus on the positive part in your life and make it the fuel to travel through this journey.

(4) Don’t be a pussy and run away from your problems. Face them and fail for good. Failure is good and necessary to pump yourself for next round.

(5) Always learn from your mistakes and try not to repeat them.

(6) Keep personal and professional life way apart.

(7) Have patience till the deal is closed. The fast shortcut route always has a price; if you are willing to pay, then it’s your choice.

Hope this helps!

Off to the mountains this weekend! Will post photos soon on Instagram 

Security in Node | Express

Another good topic of concern from the last meetup is the security of Express/Node applications.

You can download an Express-ready skeleton/seed that has all this configuration set up for you from the link mentioned below. You can use it to start building your application right away.

This post is based on the observations that I collected from various sources on the internet. I have also added a suitable conclusion based on the collection and analysis. So let's get started.

Step 1 : Follow best practices to actually solve most security issues

  • No root please : This is already set up for you. Hey wait! What the hell does it actually mean? Ports like 80 and 443 are privileged ports and require root access to bind. But why would you use them? Exactly, you don’t have to: for noobs it's already fixed, as the default port is set to 3000. You can also use 8080, but not any port below 1024. You can read this awesome stacker that tells why ports below 1024 are privileged.

Ok. Suppose you do have to listen on a privileged port (below 1024). You can call the Node functions process.setgid() and process.setuid() after the server has started listening on the port in app.js. This makes the process run as a specific gid and uid that have lower privileges than root.

http.createServer(app).listen(app.get('port'), function(){
console.log("Express server listening on port " + app.get('port'));
process.setgid(config.gid);
process.setuid(config.uid);
});
  • Use HTTPS when dealing with user sessions : Remember my presentation where I was talking about using connect-mongo to save the session in MongoDB. Make sure you set both secure and httpOnly to true on the session cookie. With secure set to true, the browser sends the session cookie only over HTTPS (SSL); with httpOnly set to true, client-side scripts cannot read it.
app.use(express.session({
secret: "notagoodsecretnoreallydontusethisone",
cookie: {httpOnly: true, secure: true},
}));
  • Use Helmet for Security Headers : It has all these middle-wares that can help you implement various security headers to protect your app in various ways. To know about the various security headers that make a difference check here.
  1. csp (Content Security Policy)
  2. hsts (HTTP Strict Transport Security)
  3. xframe (X-Frame-Options)
  4. iexss (X-XSS-Protection for IE8+)
  5. ienoopen (X-Download-Options for IE8+)
  6. contentTypeOptions (X-Content-Type-Options)
  7. cacheControl (Cache-Control)
  8. crossdomain (crossdomain.xml)
  9. hidePoweredBy (remove X-Powered-By)

You should implement them as part of app.configure in app.js. Soon I will talk about how the various security headers work in general.

Express also has a built-in middleware that helps protect you from CSRF. It's not enabled by default, but you can use it if you want, just in case you want your app to be secure. Sarcastic jokes apart, the code is as simple as it sounds. We use “csrftoken” to expose a specific token to every template. Just check this very interesting post that tells how Facebook solves the CSRF/XSRF issues on its end.


app.use(express.csrf());
app.use(function (req, res, next) {
res.locals.csrftoken = req.session._csrf;
next();
});
app.use(helmet.xframe());
app.use(helmet.iexss());
  • Do you use the default error handlers : Yes, these are created by default as of Express v4. You do have to configure them yourself if you use index.html rather than ejs/jade, but it's not that tough.

Step 2 : Define your strategy for HTTP API done with Express :

Yes, you got it right if you are thinking as a back-end developer. If you don’t use these strategies, then the correct phrase would be “Shit just got real”. All your data objects that are stored in the data store can be easily read or, worse, modified via the HTTP API that you implemented so beautifully.

  1. Use a middleware that does authorization for you : Create a function that decides whether the request is authorized or unauthorized. Check “express-authorization” (or just use any other that meets your need) and write the functions access() and checkAuthorization().
  2. Now apply this function globally with app.use(), so that every REST resource you define for the API is protected and only the guest endpoints are left open.
  3. Define guest endpoints.

//Define in app.js or server.js

var authorize = require('express-authorization');

function access(req, res, next) {
  checkAuthorization(req, function (err, authorized) {
    if (err || !authorized) {
      // return here so that next() is not called for unauthorized requests
      return res.send({message: 'Unauthorized', status: 401});
    }

    next();
  });

  function checkAuthorization(req, callback) {
    //You have to do this as per the express-authorization API parameters and of course as per your application.
    authorize.ensureRequest.isPermitted("restricted:view")
  }
}

//Define this in routes.js

function peopleApi(app) {
app.get('/api/people',
authorize.access,
getPeople);

app.get('/api/people/:id',
authorize.access,
getPerson);

app.post('/api/people',
authorize.access,
postPerson);

}

module.exports = peopleApi;

//Setting up Guest endpoints

function guest(req, res, next) {
req.guestAccess = true;
next();
}

app.get('/api/people/meta',
authorize.guest, // no authentication required!
getPeopleMeta);

// Define ApplyAuthentication function to app.js or server.js

applyAuthentication(app, ['/api']); // apply authentication here

//Define the specific authentication anywhere

var _ = require('underscore');
var middleware = require('../middleware');

function applyAuthentication(app, routesToSecure) {
for (var verb in app.routes) {
var routes = app.routes[verb];
routes.forEach(patchRoute);
}

function patchRoute (route) {
var apply = _.any(routesToSecure, function (r) {
return route.path.indexOf(r) === 0;
});

var guestAccess = _.any(route.callbacks, function (r) {
return r.name === 'guest';
});

if (apply && !guestAccess) {
route.callbacks.splice(0, 0, middleware.access.authenticatedAccess());
}
}
}

module.exports = applyAuthentication;

Step 3 : Don’t use body-parser()

Source : Here

  • If you go through the post, you will read that using bodyParser() increases the number of temporary files. The only valid question is how that is even a security concern.
  • I use some interesting cloud providers that give me limited disk space, and at the rate at which bodyParser() generates temp files, my server process would shut down until extra space is configured. A halt in service leads to poor customer feedback.
  • The solution, as mentioned, is to clean up the temp files.

Implementing Access control | Express – Node

Yesterday at the meetup someone asked me about “Authorization” in Node applications. I thought it best to compile a list of good sources and clearly define the mindset so that one can implement the same.

If you have been following the content or the presentation yesterday, then just to clarify the mindset before developing a MEAN app :

  1. On the back-end, one creates a set of RESTful services/APIs that forward the data objects with or without authentication.
  2. On the front-end, one creates a set of Angular-based controllers + services that call the specific back-end API or connect, with authentication, to the data binding resource.

Clear enough!? This works fine if you are authenticating a user session, but what about authorization?

Step 1 : Using connect-roles to create roles in your application. 

This npm package is very good. Let's say we use PassportJS for authentication; then the simple steps would be :

//All the settings are in app.js or server.js

// We would define three roles – Public, Private as user and Admin. All the rules can be setup using the default API (here)


var authentication = require('passport');
var ConnectRoles = require('connect-roles');
var express = require('express');
var app = express();

var user = new ConnectRoles({
  failureHandler: function (req, res, action) {
    // optional function to customise code that runs when
    // user fails authorisation
    var accept = req.headers.accept || '';
    res.status(403);
    if (~accept.indexOf('html')) {
      res.render('access-denied', {action: action});
    } else {
      res.send('Access Denied - You don\'t have permission to: ' + action);
    }
  }
});

// initialise Passport first so that req.isAuthenticated() is available to the rules
app.use(authentication.initialize());
app.use(user.middleware());

//anonymous users can only access the home page
//returning false stops any more rules from being
//considered
user.use(function (req, action) {
if (!req.isAuthenticated()) return action === 'access home page';
})

//moderator users can access private page, but
//they might not be the only ones so we don't return
//false if the user isn't a moderator
user.use('access private page', function (req) {
if (req.user.role === 'moderator') {
return true;
}
})

//admin users can access all pages
user.use(function (req) {
if (req.user.role === 'admin') {
return true;
}
});
app.get('/', user.can('access home page'), function (req, res) {
res.render('private');
});
app.get('/private', user.can('access private page'), function (req, res) {
res.render('private');
});
app.get('/admin', user.can('access admin page'), function (req, res) {
res.render('admin');
});

app.listen(3000);
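
To see how the three rules above interact, here is a small Python sketch of the rule-chain idea as I understand it from the comments in the code: rules are tried in order, true grants access, false stops the chain, and anything else falls through to the next rule. This is an illustration only, not the connect-roles implementation. In particular, denying access when no rule answers is my assumption, and the can() helper and the request dictionaries are made up.

```python
def can(rules, request, action):
    """Evaluate (action_or_None, handler) rules in order; first verdict wins."""
    for rule_action, handler in rules:
        if rule_action is not None and rule_action != action:
            continue  # rule is registered for a different action
        verdict = handler(request, action)
        if verdict is True:
            return True
        if verdict is False:
            return False
        # None means "no opinion", so the next rule is considered
    return False  # assumption: nobody granted access, so deny


def anonymous_rule(request, action):
    # anonymous users can only access the home page
    if not request["authenticated"]:
        return action == "access home page"
    return None


def moderator_rule(request, action):
    # moderators may access the private page; others fall through
    return True if request.get("role") == "moderator" else None


def admin_rule(request, action):
    # admins can access all pages
    return True if request.get("role") == "admin" else None


RULES = [
    (None, anonymous_rule),
    ("access private page", moderator_rule),
    (None, admin_rule),
]

guest = {"authenticated": False}
moderator = {"authenticated": True, "role": "moderator"}
admin = {"authenticated": True, "role": "admin"}

print(can(RULES, guest, "access home page"))
print(can(RULES, moderator, "access admin page"))
```

The order matters: because the anonymous rule returns false for everything except the home page, an unauthenticated request never reaches the moderator or admin rules.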

Step 2 : Build your own authorization properties in Express applications

// So far the best implementation I have used is Drywall

//The key to such a specific implementation is creating multiple Views for every access property

// Once you have defined each and every section you need to modify the same using Routing i.e on specific authorization do a specific action.

#1 : You need to define the functions that take care of property action for authorization modules.

function ensureAuthenticated(req, res, next) {

if (req.isAuthenticated()) {
return next();
}
res.set('X-Auth-Required', 'true');
req.session.returnUrl = req.originalUrl;
res.redirect('/login/');
}

function ensureAdmin(req, res, next) {
if (req.user.canPlayRoleOf('admin')) {
return next();
}
res.redirect('/');
}

function ensureAccount(req, res, next) {
if (req.user.canPlayRoleOf('account')) {
if (req.app.config.requireAccountVerification) {
if (req.user.roles.account.isVerified !== 'yes' && !/^\/account\/verification\//.test(req.url)) {
return res.redirect('/account/verification/');
}
}
return next();
}
res.redirect('/');
}

#2 : Now just use simple authentication and authorization when implementing verbs for your REST API – get, put, post and delete.  I have removed all the code for general View rendering and Authentication.


exports = module.exports = function(app, passport) {

//admin

app.all('/admin*', ensureAuthenticated);
app.all('/admin*', ensureAdmin);
app.get('/admin/', require('./views/admin/index').init);

//admin > users
app.get('/admin/users/', require('./views/admin/users/index').find);
app.post('/admin/users/', require('./views/admin/users/index').create);
app.get('/admin/users/:id/', require('./views/admin/users/index').read);
app.put('/admin/users/:id/', require('./views/admin/users/index').update);
app.put('/admin/users/:id/password/', require('./views/admin/users/index').password);
app.put('/admin/users/:id/role-admin/', require('./views/admin/users/index').linkAdmin);
app.delete('/admin/users/:id/role-admin/', require('./views/admin/users/index').unlinkAdmin);
app.put('/admin/users/:id/role-account/', require('./views/admin/users/index').linkAccount);
app.delete('/admin/users/:id/role-account/', require('./views/admin/users/index').unlinkAccount);
app.delete('/admin/users/:id/', require('./views/admin/users/index').delete);

//admin > administrators
app.get('/admin/administrators/', require('./views/admin/administrators/index').find);
app.post('/admin/administrators/', require('./views/admin/administrators/index').create);
app.get('/admin/administrators/:id/', require('./views/admin/administrators/index').read);
app.put('/admin/administrators/:id/', require('./views/admin/administrators/index').update);
app.put('/admin/administrators/:id/permissions/', require('./views/admin/administrators/index').permissions);
app.put('/admin/administrators/:id/groups/', require('./views/admin/administrators/index').groups);
app.put('/admin/administrators/:id/user/', require('./views/admin/administrators/index').linkUser);
app.delete('/admin/administrators/:id/user/', require('./views/admin/administrators/index').unlinkUser);
app.delete('/admin/administrators/:id/', require('./views/admin/administrators/index').delete);

//admin > admin groups
app.get('/admin/admin-groups/', require('./views/admin/admin-groups/index').find);
app.post('/admin/admin-groups/', require('./views/admin/admin-groups/index').create);
app.get('/admin/admin-groups/:id/', require('./views/admin/admin-groups/index').read);
app.put('/admin/admin-groups/:id/', require('./views/admin/admin-groups/index').update);
app.put('/admin/admin-groups/:id/permissions/', require('./views/admin/admin-groups/index').permissions);
app.delete('/admin/admin-groups/:id/', require('./views/admin/admin-groups/index').delete);

//admin > accounts
app.get('/admin/accounts/', require('./views/admin/accounts/index').find);
app.post('/admin/accounts/', require('./views/admin/accounts/index').create);
app.get('/admin/accounts/:id/', require('./views/admin/accounts/index').read);
app.put('/admin/accounts/:id/', require('./views/admin/accounts/index').update);
app.put('/admin/accounts/:id/user/', require('./views/admin/accounts/index').linkUser);
app.delete('/admin/accounts/:id/user/', require('./views/admin/accounts/index').unlinkUser);
app.post('/admin/accounts/:id/notes/', require('./views/admin/accounts/index').newNote);
app.post('/admin/accounts/:id/status/', require('./views/admin/accounts/index').newStatus);
app.delete('/admin/accounts/:id/', require('./views/admin/accounts/index').delete);

//admin > statuses
app.get('/admin/statuses/', require('./views/admin/statuses/index').find);
app.post('/admin/statuses/', require('./views/admin/statuses/index').create);
app.get('/admin/statuses/:id/', require('./views/admin/statuses/index').read);
app.put('/admin/statuses/:id/', require('./views/admin/statuses/index').update);
app.delete('/admin/statuses/:id/', require('./views/admin/statuses/index').delete);

//admin > categories
app.get('/admin/categories/', require('./views/admin/categories/index').find);
app.post('/admin/categories/', require('./views/admin/categories/index').create);
app.get('/admin/categories/:id/', require('./views/admin/categories/index').read);
app.put('/admin/categories/:id/', require('./views/admin/categories/index').update);
app.delete('/admin/categories/:id/', require('./views/admin/categories/index').delete);

//admin > search
app.get('/admin/search/', require('./views/admin/search/index').find);

//account
app.all('/account*', ensureAuthenticated);
app.all('/account*', ensureAccount);
app.get('/account/', require('./views/account/index').init);

//account > verification
app.get('/account/verification/', require('./views/account/verification/index').init);
app.post('/account/verification/', require('./views/account/verification/index').resendVerification);
app.get('/account/verification/:token/', require('./views/account/verification/index').verify);

//account > settings
app.get('/account/settings/', require('./views/account/settings/index').init);
app.put('/account/settings/', require('./views/account/settings/index').update);
app.put('/account/settings/identity/', require('./views/account/settings/index').identity);
app.put('/account/settings/password/', require('./views/account/settings/index').password);

//account > settings > social
app.get('/account/settings/twitter/', passport.authenticate('twitter', { callbackURL: '/account/settings/twitter/callback/' }));
app.get('/account/settings/twitter/callback/', require('./views/account/settings/index').connectTwitter);
app.get('/account/settings/twitter/disconnect/', require('./views/account/settings/index').disconnectTwitter);
app.get('/account/settings/github/', passport.authenticate('github', { callbackURL: '/account/settings/github/callback/' }));
app.get('/account/settings/github/callback/', require('./views/account/settings/index').connectGitHub);
app.get('/account/settings/github/disconnect/', require('./views/account/settings/index').disconnectGitHub);
app.get('/account/settings/facebook/', passport.authenticate('facebook', { callbackURL: '/account/settings/facebook/callback/' }));
app.get('/account/settings/facebook/callback/', require('./views/account/settings/index').connectFacebook);
app.get('/account/settings/facebook/disconnect/', require('./views/account/settings/index').disconnectFacebook);
app.get('/account/settings/google/', passport.authenticate('google', { callbackURL: '/account/settings/google/callback/', scope: ['profile email'] }));
app.get('/account/settings/google/callback/', require('./views/account/settings/index').connectGoogle);
app.get('/account/settings/google/disconnect/', require('./views/account/settings/index').disconnectGoogle);
app.get('/account/settings/tumblr/', passport.authenticate('tumblr', { callbackURL: '/account/settings/tumblr/callback/' }));
app.get('/account/settings/tumblr/callback/', require('./views/account/settings/index').connectTumblr);
app.get('/account/settings/tumblr/disconnect/', require('./views/account/settings/index').disconnectTumblr);

//route not found
app.all('*', require('./views/http/index').http404);
};

Next Stop – Meteor JS

Realtime applications are the next cool, and maybe big, thing. I am still thinking of starting with a d3-based Meteor app for the next NEAN meetup #2. The basic idea would be :

  • User enters the sentiment
  • The Visualization would show the realtime twitter sentiment changes using Twit – Twitter API for Node (https://github.com/ttezel/twit)
  • The visualization would be a map of Singapore, and the Twitter data would be realtime.

In the meantime, you can grab the awesome screencast to start exploring Meteor as an app development framework. Pretty cool!

Developing the Live APP | NEAN meetup#1

I believe in writing to the point, so as a blogger I can be considered lazy: I micro-write. I don’t have the time to write like a journalist, so bear with me if you are a stickler for grammar.

I thought it would be great for all the developers to try the app (here) that will be live during the event. For people who are new to it and want to make their own hack mash-ups, I have written a small, to-the-point tutorial on how the app is made and how it works. Do go through the presentation here : http://nean1.gautamanand.in

You can download the App here : https://github.com/ga1989/NEAN-1-LiveAPP-Angular-express

Step 1:  Setup


$ express -e NEANapp

$ cd NEANapp


//Configuring the Package.json

{
"name" : "NEANDemo",
"main" : "server.js",
"dependencies" : {
"express" : "~3.4.4",
"mongoose" : "~3.6.2"
}
}


$ npm install

// And now you have a successful express app.

Step 2 : Configure Express app – server.js

// We would define all the variables

// Configure port, routes, database and middle-wares


// set up ======================================================================
var express = require('express');
var app = express(); // create our app w/ express
var mongoose = require('mongoose'); // mongoose for mongodb
var port = process.env.PORT || 8080; // set the port
var database = require('./config/database'); // load the database config

// configuration ===============================================================
mongoose.connect(database.url); // connect to the MongoDB database defined in ./config/database.js

app.configure(function() {
app.use(express.static(__dirname + '/public')); // set the static files location /public/img will be /img for users
app.use(express.logger('dev')); // log every request to the console
app.use(express.bodyParser()); // pull information from html in POST
app.use(express.methodOverride()); // simulate DELETE and PUT
});

// routes ======================================================================
require('./app/routes.js')(app);

// listen (start app with node server.js) ======================================
app.listen(port);
console.log("App listening on port " + port);

Step 3 : Configure Express App – ./app/models/todo.js

// Here we define which JSON objects to store in the database.

// We just need a text field that holds the task entered and a Boolean that records whether the task has been done.


var mongoose = require('mongoose');

module.exports = mongoose.model('Todo', {
text : String,
done : Boolean
});

Step 4 : Configure Express App – ./app/routes.js

// We now need to define the REST API and its functions

// CRUD maps to POST, GET, PUT and DELETE; this app only needs GET, POST and DELETE

// At the end we have also set up a catch-all route for '*' that serves index.html, so Angular can take over on the front end.

// Data is passed to the REST API via AJAX calls made by Angular.


var Todo = require('./models/todo');

module.exports = function(app) {

// api ---------------------------------------------------------------------
// get all todos
app.get('/api/todos', function(req, res) {

// use mongoose to get all todos in the database
Todo.find(function(err, todos) {

// if there is an error retrieving, send the error; the return makes sure nothing after res.send(err) executes
if (err)
return res.send(err);

res.json(todos); // return all todos in JSON format
});
});

// create todo and send back all todos after creation
app.post('/api/todos', function(req, res) {

// create a todo, information comes from AJAX request from Angular
Todo.create({
text : req.body.text,
done : false
}, function(err, todo) {
if (err)
return res.send(err);

// get and return all the todos after you create another
Todo.find(function(err, todos) {
if (err)
return res.send(err);
res.json(todos);
});
});

});

// delete a todo
app.delete('/api/todos/:todo_id', function(req, res) {
Todo.remove({
_id : req.params.todo_id
}, function(err, todo) {
if (err)
return res.send(err);

// get and return all the remaining todos after you delete one
Todo.find(function(err, todos) {
if (err)
return res.send(err);
res.json(todos);
});
});
});

// application -------------------------------------------------------------
app.get('*', function(req, res) {
res.sendfile('./public/index.html'); // load the single view file (angular will handle the page changes on the front-end)
});
};
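
A detail worth keeping in mind with handlers like the ones above: calling res.send() does not end the surrounding function, so any code after it keeps running unless the handler returns. The sketch below shows this with a fake response object (makeFakeResponse and both handler names are made up for illustration and are not part of Express); it only records which response methods were called.

```javascript
// Hypothetical stand-in for an Express response object: it only records calls.
function makeFakeResponse() {
  var calls = [];
  return {
    calls: calls,
    send: function (body) { calls.push(['send', body]); },
    json: function (body) { calls.push(['json', body]); }
  };
}

// Without `return`, execution falls through and a second response is attempted.
function handlerWithoutReturn(err, todos, res) {
  if (err)
    res.send(err);
  res.json(todos);
}

// With `return`, the handler stops after reporting the error.
function handlerWithReturn(err, todos, res) {
  if (err)
    return res.send(err);
  res.json(todos);
}

var resA = makeFakeResponse();
handlerWithoutReturn(new Error('db down'), undefined, resA);
console.log(resA.calls.length); // 2 (send, then json)

var resB = makeFakeResponse();
handlerWithReturn(new Error('db down'), undefined, resB);
console.log(resB.calls.length); // 1 (send only)
```

With a real Express response, the second call would try to answer a request that has already been answered.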

Step 5 : Configure the Express App : ./config/database.js

// This is a simple exported module to define the database name. It can also be included in server.js

// The config folder can also be used to maintain test cases by jasmine. Test cases are good if you want to check for errors before deploying to a server.


module.exports = {

// the database url to connect
url : 'mongodb://localhost/Userdata'
}
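
If you later deploy somewhere other than localhost, one possible variant (a sketch only) is to let an environment variable override the URL. MONGO_URL and resolveDatabaseUrl are names assumed for this example; the original app does not use them.

```javascript
// Sketch of a config helper: MONGO_URL is a hypothetical environment variable name.
function resolveDatabaseUrl(env) {
  // fall back to the local database used in this tutorial
  return env.MONGO_URL || 'mongodb://localhost/Userdata';
}

// In ./config/database.js this object is what you would assign to module.exports.
var databaseConfig = {
  url: resolveDatabaseUrl(process.env)
};

console.log(resolveDatabaseUrl({})); // mongodb://localhost/Userdata
```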

Step 6 : Configuring the Angular App ./public/core.js

// This core file is very straightforward. It decides how to call and sync with the Express REST API.

// It passes the results on to index.html through directives, controllers and data injection.


var NEAN = angular.module('NEAN', []);

function mainController($scope, $http) {
$scope.formData = {};

// when landing on the page, get all todos and show them
$http.get('/api/todos')
.success(function(data) {
$scope.todos = data;
})
.error(function(data) {
console.log('Error: ' + data);
});

// when submitting the add form, send the text to the node API
$scope.createTodo = function() {
$http.post('/api/todos', $scope.formData)
.success(function(data) {
$scope.formData = {}; // clear the form so our user is ready to enter another
$scope.todos = data;
console.log(data);
})
.error(function(data) {
console.log('Error: ' + data);
});
};

// delete a todo after checking it
$scope.deleteTodo = function(id) {
$http.delete('/api/todos/' + id)
.success(function(data) {
$scope.todos = data;
})
.error(function(data) {
console.log('Error: ' + data);
});
};

}

Step 7 : Configuring the Angular App  ./public/index.html


<!doctype html>

<!-- ASSIGN OUR ANGULAR MODULE -->
<html ng-app="NEAN">
<head>
<!-- META -->
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1"><!-- Optimize mobile viewport -->

<title>My Tasks Lists</title>

<!-- SCROLLS -->
<link rel="stylesheet" href="//netdna.bootstrapcdn.com/bootstrap/3.0.0/css/bootstrap.min.css"><!-- load bootstrap -->
<style>
html { overflow-y:scroll; }
body { padding-top:50px; }
#todo-list { margin-bottom:30px; }
#todo-form { margin-bottom:50px; }
</style>

<!-- SPELLS -->
<script src="//ajax.googleapis.com/ajax/libs/angularjs/1.2.16/angular.min.js"></script><!-- load angular -->
<script src="core.js"></script>

</head>
<!-- SET THE CONTROLLER AND GET ALL TODOS WITH INITIALIZE FUNCTION -->
<body ng-controller="mainController">
<div class="container">

<!-- HEADER AND TODO COUNT -->
<div class="jumbotron text-center">
<h1>NEAN Loves tasks <span class="label label-info">{{ todos.length }}</span></h1>
</div>

<!-- TODO LIST -->
<div id="todo-list" class="row">
<div class="col-sm-4 col-sm-offset-4">

<!-- LOOP OVER THE TODOS IN $scope.todos -->
<div class="checkbox" ng-repeat="todo in todos">
<label>
<input type="checkbox" ng-click="deleteTodo(todo._id)"> {{ todo.text }}
</label>
</div>

</div>
</div>

<!-- FORM TO CREATE TODOS -->
<div id="todo-form" class="row">
<div class="col-sm-8 col-sm-offset-2 text-center">
<form>
<div class="form-group">

<!-- BIND THIS VALUE TO formData.text IN ANGULAR -->
<input type="text" class="form-control input-lg text-center" placeholder="Add tasks to me here" ng-model="formData.text">
</div>

<!-- createTodo() WILL CREATE NEW TODOS -->
<button type="submit" class="btn btn-primary btn-lg" ng-click="createTodo()">Add</button>
</form>
</div>
</div>

 

</div>

</body>
</html>

Using Nitrous.io | tmux

Action.io, now Nitrous.io, is a super awesome tool for developers on the go or collaborating from remote places. Reasons why I use it :

  • The web IDE is very robust and works even on mobile. I can control everything from my web browser, just as on a regular Linux screen.
  • Developers can collaborate across countries.
  • IRC-style chat between collaborators helps you track and discuss issues while debugging.
  • Test case support using Jasmine etc. Other services such as Mongo are provided through Autoparts.
  • I can run a web app on a free link and also switch it to a custom domain very easily. It’s just $20/month of awesomeness.
  • The session for a specific app can be maintained by tmux.

Bad :

  • Nodemon and Grunt don’t work even if you do npm install.
  • Sometimes I get SSH errors while using remote SSH client.

# Below are some useful commands for starting a tmux session. Without one, a node server stops as soon as you close the terminal/IDE.

tmux new -s my_session


// Create a Session

$ cd workspace

$ git clone repo

$ cd repo

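$ tmux new -s repo_session

// Inside repo_session, start the server

$ node app.js

// Detach with Ctrl-b d and the server keeps running. Reattach later with :

$ tmux attach -t repo_session

// List sessions

$ tmux ls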

 

Although it’s great for running your app during development, for hardcore deployment think about Heroku and Nodejitsu.

Confessions of a JS Developer

I wrote a lot of lines in Python back in the early days, but my new love for Node is so motivating that I don’t want to go back to old school. I have been learning/hacking Node/Angular for a few months now, and there are a few confessions that I want to share (if anyone cares to read, anyways!) :

  • When ExpressJS works fine with plain HTML, why do people use Jade or EJS? I know that you want to speed up loading, but it’s a brick wall when scaling up applications. Developers go nuts when they see Jade. Please make HTML the default when an Express app is created from the command line. We love HTML.
  • MEAN.IO and Synth are both frameworks well worth trying. They use Express as it’s community friendly. Go try them.
  • Never use Handlebars/Ember.js if you want to use Angular. Both work awesomely for semantic templates, but only Angular also takes care of your front-end MVC, which is strikingly similar to Express in many ways.
  • If you want to save time creating a REST API for your Angular-hungry controller, try looking at Restify or see this post.
  • If you are in the data-driven space, full-stack JS works awesomely fast.
  • Bootstrap works better with Angular than Foundation does. Try looking at the UI directives.
  • MongoDB for app data plus Redis (or connect-mongo) for session data works magically sweet. Try using Mongoose.

 

 

Git Track Untracked files


// You have made changes to the code and want to push to the remote repository

$ git add .

$ git status

// Now you see that some files have not been tracked

$ git update-index --no-assume-unchanged /path/filename.filetype

// Error : The file couldn't be marked. To solve this issue, add the file explicitly

$ git add filename.filetype

$ git commit -m "commit message"

$ git push _remote _branch

// On Windows you may also hit "filename too long" errors when a file path is longer than 256 characters. Adding those files explicitly works for them too.

Install MongoDB using cmd

I know it’s super easy to install the Mongo daemon on Ubuntu, but it’s not that tough to run it as a service on Windows.


$ wget http://fastdl.mongodb.org/win32/mongodb-win32-i386-2.4.9.zip

//Extract the folder to /path/MongoDB

//Add /path/MongoDB/bin to the Environment variable "path"

//Create a mongo.config at /path/mongodb/mongo.config
# mongodb.conf

# data lives here
dbpath=\path\mongo\data

# where to log
logpath=\path\mongo\log\mongo.log
logappend=true

# only run on localhost for development
bind_ip = 127.0.0.1

port = 27017
rest = true
//End of the Config

//Now all set just Start your mongo daemon

$ mongod.exe --config \path\mongodb\mongo.config

$ \path\mongodb\bin>mongod --config \path\mongodb\mongo.config

//Now its time to connect to your MongoDB

$cd /path/mongodb/bin

$ mongo

//Install Mongo as Windows service
$ \path\mongodb\bin\mongod --config \path\mongodb\mongo.config --install

// To start the Mongo service :

$ net start MongoDB

// To stop the Mongo service :

$ net stop MongoDB

Why use template language like jade ?

If you are developing Express-based NodeJS applications, Jade is the new kid in town. If you thought it was just simple HTML converted, take a look again.

You can find your very first Jade templates in a freshly generated Express application.


$ npm install -g express

$ express -c stylus App_name

$ cd App_name

$ npm install

$ cd views

// Index.jade

//Layout.jade

$ cd ..

$ nodemon app

// You can see the template representation in browser as regular HTML under page source.

Before getting into the syntax basics of Jade, it is important to understand why you would need a template engine. Some benefits of Jade specifically are:

  • It’s a templating language for HTML, so it makes writing HTML less verbose and “easier” (once you’ve learned its syntax).
  • It supports template inheritance (which is awesome and super useful).
  • You can compile templates into re-usable functions that can be run on the server or client, passing in different data-sets and rendering them on demand (can be super nice for single-page applications).
  • You do not need a sophisticated editor to find unmatched closing tags: there are no closing tags!
  • No code formatter required: indentation reflects nesting.
  • You can use markdown for readable markup.
  • Mixins are powerful but easy to read.
  • I cannot live without block append/prepend. It is useful for a template hierarchy.
  • Works perfectly in combination with Angular, the superheroic JavaScript MVW framework.
  • You can reuse Jade templates.

 

Convert HTML to jade :


$ npm install -g html2jade

$ html2jade website_address > website_name.jade

// You can use the online version - http://html2jade.aaron-powell.com/

 

The best source for Jade documentation : http://naltatis.github.io/jade-syntax-docs/

 

 

Restart node process using nodemon


$ npm install -g nodemon

$ cd App_Folder

$ nodemon app.js

// app.js is the entry file that starts your node application.

It behaves like this : during development it watches your source files and, if there is any change, it restarts the node process.

Features :

(1) Automatic restart of the application

(2) Detect specific files

(3) Ignore specific files
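
To make the three features concrete, here is a rough sketch of the idea using only Node's built-in modules. It is not how nodemon itself is implemented; shouldRestart, watchAndRestart and the option names extensions and ignored are made up for this example. The watcher is defined but not started here, and it has no debouncing.

```javascript
var fs = require('fs');
var path = require('path');
var spawn = require('child_process').spawn;

// Features (2) and (3): react only to chosen extensions, skip ignored folders.
// `extensions` and `ignored` are hypothetical option names for this sketch.
function shouldRestart(file, options) {
  var extensions = options.extensions || ['.js'];
  var ignored = options.ignored || ['node_modules'];
  var parts = file.split(/[\\/]/);
  var isIgnored = parts.some(function (part) {
    return ignored.indexOf(part) !== -1;
  });
  if (isIgnored) return false;
  return extensions.indexOf(path.extname(file)) !== -1;
}

// Feature (1): kill and respawn the node process when a relevant file changes.
// fs.watch on a directory only reports its direct children unless the
// platform supports recursive watching, so treat this as a toy.
function watchAndRestart(dir, script, options) {
  var child = spawn(process.execPath, [script], { stdio: 'inherit' });
  return fs.watch(dir, function (eventType, filename) {
    if (!filename || !shouldRestart(String(filename), options)) return;
    child.kill();
    child = spawn(process.execPath, [script], { stdio: 'inherit' });
  });
}

console.log(shouldRestart('app/routes.js', {}));                 // true
console.log(shouldRestart('node_modules/express/index.js', {})); // false
console.log(shouldRestart('public/style.css', {}));              // false
```

Calling watchAndRestart('.', 'app.js', {}) would start app.js and restart it on changes to .js files in the current folder.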

 

Redis vs MongoDB

If you want to build realtime applications, you need to write async functions in Node that work with in-memory data stores like Redis. Still, most people are confused about why we even need Redis when we are already using something like MongoDB.

The fundamentals can be broken down into :

  • Data Model :

MongoDB

Document oriented, JSON-like. Each document has a unique key within a collection. Documents are heterogeneous.

Redis

Key-value, values are:

  1. Lists of strings
  2. Sets of strings (collections of non-repeating unsorted elements)
  3. Sorted sets of strings (collections of non-repeating elements ordered by a floating-point number called score)
  4. Hashes where keys are strings and values are either strings or integers.
  • Storage 

MongoDB

Disk, memory-mapped files, index should fit in RAM.

Redis

Typically in-memory.

  • Querying

MongoDB

By key, by any value in document (indexing possible), Map/Reduce.

Redis

By key.

—————

Both can be used together for good results (Craigslist does so).

MongoDB is interesting for persistent, document oriented, data indexed in various ways. Redis is more interesting for volatile data, or latency sensitive semi-persistent data.

  • Redis can be used for user sessions and MongoDB can be used for user data.
  • Redis can be used for advanced features (low latency, item expiration, queues, pub/sub, atomic blocks, etc …) on top of MongoDB.
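
The split between session data and user data can be sketched without either database installed. In the example below, createExpiringStore is an in-memory stand-in for a Redis-like store with item expiration, and the users object is a stand-in for a MongoDB collection. Neither is a real client; all names are assumptions for illustration, and the clock is a plain variable so that expiry can be shown without waiting.

```javascript
// In-memory stand-in for a key/value store with per-key expiry (not a real Redis client).
function createExpiringStore(now) {
  var entries = {};
  return {
    set: function (key, value, ttlMs) {
      entries[key] = { value: value, expiresAt: now() + ttlMs };
    },
    get: function (key) {
      var entry = entries[key];
      if (!entry) return null;
      if (now() >= entry.expiresAt) {
        delete entries[key]; // expired items disappear on their own
        return null;
      }
      return entry.value;
    }
  };
}

// In-memory stand-in for a persistent document collection (not a real MongoDB driver).
var users = {
  u1: { _id: 'u1', name: 'Asha', email: 'asha@example.com' }
};

var clock = 0; // fake time in milliseconds
var sessions = createExpiringStore(function () { return clock; });

// Volatile, latency-sensitive data: the session, valid for 30 minutes.
sessions.set('session:abc', { userId: 'u1' }, 30 * 60 * 1000);

// Persistent data: the user document, looked up through the session.
function currentUser(sessionId) {
  var session = sessions.get('session:' + sessionId);
  return session ? users[session.userId] : null;
}

console.log(currentUser('abc').name); // Asha
clock += 31 * 60 * 1000;              // 31 minutes later
console.log(currentUser('abc'));      // null, the session has expired
```

The user document is still there after the session is gone, which is the division of labour described above.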

# Please note: you should never run Redis and MongoDB servers on the same machine. MongoDB’s memory is designed to be swapped out; Redis’s is not. If MongoDB triggers some swapping activity, the performance of Redis will be catastrophic. They should be isolated on different nodes.

On a higher level :

For use-cases:

  • Redis is often used as a caching layer or shared whiteboard for distributed computation.
  • MongoDB is often used as a swap-out replacement for traditional SQL databases.

Technically:

  • Redis is an in-memory db with disk persistence (the whole db needs to fit in RAM).
  • MongoDB is a disk-backed db which only needs enough RAM for the indexes.

There is some overlap, but it is extremely common to use both. Here’s why:

  • MongoDB can store more data more cheaply.
  • Redis is faster for the entire dataset.
  • MongoDB’s culture is “store it all, figure out access patterns later”
  • Redis’s culture is “carefully consider how you’ll access data, then store”
  • Both have open source tools that depend on them, many of which are used together.

NoSql movement

Characteristics :

  1. Non relational
  2. Open-source
  3. Cluster friendly
  4. Schema-less : This changes the game compared with developing applications on relational databases. It improves flexibility in some ways: you can add unstructured data to a NoSQL data store.
  5. No joins

Data-models for NoSQL

NoSQL databases use different data models, which fall into four chunks:

  • Key/Value data model - You have a key and ask the store for the value linked to that key. The database knows nothing about what is inside the value. Some stores also let you save metadata with the value and build indexes on that metadata. Think of a hash map that is persisted on disk. There is no set schema.
  • Document data model – The data is saved as a document with a complex structure, most commonly JSON. There is no set schema. We can query inside the document structure, and there is an ID for indexing.

# The difference between these two models is a bit hazy. We can call both of them aggregate-oriented databases, i.e. each stored unit holds all of its data without any set schema. In reality, the difference between them doesn’t matter that much.

# In a relational DB an aggregate is spread across many tables because of the set schema; without the schema we cannot add a value to the database. In NoSQL we can save the whole complex structure as one data object. In the relational world, many rows (e.g. line items) together make up an object (an order) that is really a whole unit in itself. In NoSQL we just move that aggregate around as a unit: it is the value in a key/value store and the document in a document store. Conclusion – we have more flexibility while scaling the application layer.

  • Column family data model – A row key maps to multiple column keys, each with a column value. This lets you pull more information out of a single query. It is also schema-less.

# The aggregate-oriented data model is useful if you want to store and fetch the same aggregate again and again. It’s not very useful if you want to slice and dice the data (better to use a relational database).

  • Graph databases – The notable example is Neo4j. They break data into many small components and handle the connections between them very carefully. This is very different from the three aggregate-oriented models. It is also schema-less, and it has an awesome query language.
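
As a rough illustration of the four models, here is the same small order laid out four ways using plain JavaScript objects. These are toy in-memory shapes, not the storage format of any particular database, and every name in the example is made up.

```javascript
var order = {
  id: 'order:1001',
  customer: 'Asha',
  items: [{ sku: 'book-1', qty: 2 }, { sku: 'pen-7', qty: 1 }]
};

// Key/value: the store only sees a key and an opaque blob.
var keyValueStore = {};
keyValueStore[order.id] = JSON.stringify(order);

// Document: the store understands the structure, so we can query inside it.
var documentStore = [order];
function findOrdersContaining(sku) {
  return documentStore.filter(function (doc) {
    return doc.items.some(function (item) { return item.sku === sku; });
  });
}

// Column family: a row key maps to families, each holding column key/value pairs.
var columnFamilyStore = {
  'order:1001': {
    profile: { customer: 'Asha' },
    items: { 'book-1': 2, 'pen-7': 1 }
  }
};

// Graph: small nodes plus explicit relationships between them.
var graph = {
  nodes: ['Asha', 'order:1001', 'book-1', 'pen-7'],
  edges: [
    { from: 'Asha', to: 'order:1001', type: 'PLACED' },
    { from: 'order:1001', to: 'book-1', type: 'CONTAINS' },
    { from: 'order:1001', to: 'pen-7', type: 'CONTAINS' }
  ]
};
function neighbours(node, type) {
  return graph.edges
    .filter(function (edge) { return edge.from === node && edge.type === type; })
    .map(function (edge) { return edge.to; });
}

console.log(typeof keyValueStore['order:1001']);              // string
console.log(findOrdersContaining('pen-7').length);            // 1
console.log(columnFamilyStore['order:1001'].items['book-1']); // 2
console.log(neighbours('order:1001', 'CONTAINS'));            // [ 'book-1', 'pen-7' ]
```

The first three keep the order together as one aggregate; the graph version splits it into pieces and stores the links.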

 

NoSQL consistency 

RDBMS == ACID | NoSQL == BASE

# ACID stands for atomicity, consistency, isolation, and durability

# ACID is about consistency, and people don’t believe that NoSQL is consistent.

Problem : Suppose you have a single unit of information and, when you have written half the data, someone else reads it (or vice versa). This would mess things up! We need ACID updates to solve these transactional issues.

Solution : Graph databases do use ACID. Aggregate-oriented databases don’t actually require ACID across aggregates. Keep the transaction within the aggregate boundary, i.e. any update to a single aggregate is ACID in nature.

Problem : Two users of the same app connect to the front-end to change values in a data store. If they do it at the same time, how would it work? If we allow changes to the same piece of information at the same time, we will have trouble maintaining consistency.

Solution : In relational databases we have transactions, typically queued for every user. That solves consistency but is not a solution for all systems. We can instead use an “offline lock”, i.e. give each aggregate a version stamp. When user one pushes an update the stamp changes, so when user two finishes, the stale stamp reveals the conflict and the inconsistency can be resolved. ACID transactions are not the same in NoSQL.
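
The version stamp idea can be sketched in a few lines. This is a minimal in-memory example under assumed names (store, read, write); a real data store would do the version check and the update as one atomic step.

```javascript
// Each aggregate carries a version stamp next to its data.
var store = {
  'room:12': { version: 1, data: { bookedBy: null } }
};

// Reading returns the data together with the version that was seen.
function read(key) {
  var record = store[key];
  return { version: record.version, data: JSON.parse(JSON.stringify(record.data)) };
}

// Writing succeeds only if nobody has changed the record since it was read.
function write(key, expectedVersion, newData) {
  var record = store[key];
  if (record.version !== expectedVersion) {
    return { ok: false, currentVersion: record.version };
  }
  store[key] = { version: expectedVersion + 1, data: newData };
  return { ok: true, currentVersion: expectedVersion + 1 };
}

// Two users read the same version...
var seenByA = read('room:12');
var seenByB = read('room:12');

// ...user A writes first, so user B's stamp is stale.
var resultA = write('room:12', seenByA.version, { bookedBy: 'A' });
var resultB = write('room:12', seenByB.version, { bookedBy: 'B' });

console.log(resultA.ok);                     // true
console.log(resultB.ok);                     // false
console.log(store['room:12'].data.bookedBy); // A
```

User B's application now knows there was a conflict and can re-read, merge or ask the user what to do.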

Types of Consistency  :

  1. Logical – Sharding (break the data down so that each piece lives on one of multiple nodes).
  2. Replication – Replicate the same data object across multiple nodes. Now you have extra copies to fall back on in case of node failure, but they have to be kept consistent.

Problem : Users A and B want to book the same hotel room from different parts of the world. The system has to decide who gets the room. Imagine the communication between the two nodes (one node per country) is down. The nodes cannot coordinate, so a booking can be made on both sides, creating confusion and issues in the real world. How do we solve this consistency problem?

Solution : One option is to take no bookings until the connection is back up; the other is to keep going even though the line is down. With the second choice, the inconsistency has to be resolved by business logic. Amazon’s Dynamo wanted the shopping cart to be always live and had to deal with the resulting conflicts at the business level. So the way to manage these inconsistencies is through business logic.

Eventual consistency is a consistency model used in distributed computing that informally guarantees that, if no new updates are made to a given data item, eventually all accesses to that item will return the last updated value.

A quorum is the minimum number of votes that a distributed transaction has to obtain in order to be allowed to perform an operation in a distributed system. A quorum-based technique is implemented to enforce consistent operation in a distributed system

RYW (Read-Your-Writes) consistency is achieved when the system guarantees that, once a record has been updated, any attempt to read the record will return the updated value.

A conventional relational DBMS will almost always feature RYW consistency. Some NoSQL systems feature tunable consistency, in which, depending on your settings, RYW consistency may or may not be assured.

The core ideas of RYW consistency, as implemented in various NoSQL systems, are:

  • Let N = the number of copies of each record distributed across nodes of a parallel system.
  • Let W = the number of nodes that must successfully acknowledge a write  for it to be successfully committed. By definition, W <= N.
  • Let R = the number of nodes that must send back the same value of a unit of data for it to be accepted as read by the system. By definition, R <= N.
  • The greater N-R and N-W are, the more node or network failures you can typically tolerate without blocking work.
  • As long as R + W > N, you are assured of RYW consistency.

Example: Let N = 3, W = 2, and R = 2. Suppose you write a record successfully to at least two nodes out of three. Further suppose that you then poll all three of the nodes. Then the only way you can get two values that agree with each other is if at least one of them — and hence both — return the value that was correctly and successfully written to at least two nodes in the first place.
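
The R + W > N rule can be checked by brute force for small clusters. The sketch below (all function names are made up for this example) enumerates every possible set of W nodes that acknowledged a write and every possible set of R nodes answering a read, and confirms whether they always share at least one node.

```javascript
// The rule of thumb from the list above.
function isReadYourWrites(n, w, r) {
  return r + w > n;
}

// All subsets of {0..n-1} with exactly k members, encoded as bitmasks.
function subsetsOfSize(n, k) {
  var result = [];
  for (var mask = 0; mask < (1 << n); mask++) {
    var bits = 0;
    for (var i = 0; i < n; i++) {
      if (mask & (1 << i)) bits++;
    }
    if (bits === k) result.push(mask);
  }
  return result;
}

// True if every possible read set overlaps every possible write set,
// i.e. a read always reaches at least one node that has the latest write.
function everyReadSeesTheWrite(n, w, r) {
  var writeSets = subsetsOfSize(n, w);
  var readSets = subsetsOfSize(n, r);
  return writeSets.every(function (writeSet) {
    return readSets.every(function (readSet) {
      return (writeSet & readSet) !== 0;
    });
  });
}

console.log(isReadYourWrites(3, 2, 2));      // true
console.log(everyReadSeesTheWrite(3, 2, 2)); // true
console.log(isReadYourWrites(3, 1, 2));      // false
console.log(everyReadSeesTheWrite(3, 1, 2)); // false
```

With N = 3, W = 1 and R = 2, a write acknowledged only by node 0 can be missed by a read that asks nodes 1 and 2, which is exactly the case the rule excludes.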

In a conventional parallel DBMS, N = R = W, which is to say N-R = N-W = 0. Thus, a single hardware failure causes data operations to fail too. For some applications — e.g., highly parallel OLTP web apps — that kind of fragility is deemed unacceptable.

On the other hand, if W< N, it is possible to construct edge cases in which two or more consecutive failures cause incorrect data values to actually be returned. So you want to clean up any discrepancies quickly and bring the system back to a consistent state. That is where the idea of eventual consistency comes in, although you definitely can — and in some famous NoSQL implementations actually do — have eventual consistency in a system that is not RYW consistent.

When to use NoSQL ? 

It’s a matter of perception, but some main drivers are :

  1. You have large and/or unstructured data that should be easy to query and program against.
  2. People want to program easily against natural aggregate data objects.
  3. People use them for agile analytics, as opposed to the data warehousing concept. Most people use graph databases for it.

 

Best practices before writing Angular MVC

  • Use a code generator as a starting point : Yeoman and angular-seed are popular choices.
  • In your web document, place JavaScript code at the end to improve page loading time.
  • Use ng-cloak on elements containing {{data_dependency_exp}} : while the page loads, some clients might briefly see the raw {{value}}, and the ng-cloak directive hides it. The other solution is ng-bind, which lets you write value instead of {{value}}; nothing is shown until the value is loaded.
  • Don’t minify your own Angular code as-is. Minifiers rename function parameters, and Angular then has a tough time working out which dependencies to inject, unless you use the array annotation for dependencies.
  • Separating presentation and business logic is very important. Never use Angular for server-side code.
  • Structuring business logic :

# Controllers
1. Don’t reference DOM elements
2. Should focus on View behavior i.e What should happen if user does X and where I get X from?
3. New instance per view aka not singletons.

# Services
1. Shouldn’t reference DOM (mostly)
2. Have logic independent of the View aka Do X operation.
3. Singletons, i.e. one instance lives for the lifetime of the application.

  • DOM should always be manipulated from directives.
  • $scope is the glue between View and Controller. It should be read-only in the View and write-only in the Controller. This means $scope holds references to the models rather than being the model itself.
  • Whenever using bidirectional binding across parent and child scopes, make sure you don’t bind directly to scope properties (bind to a property of a model object instead), else the child scope will not work as it should. Yes, child scopes are something to be careful about in front-end MVC.
  • Structuring modules :
  1. Using multiple modules lets you take advantage of third-party code.
  2. So there should be one module per third-party reusable library.
  3. Or can be one module per test (different levels)
  4. Or can be one module per view.
  • Deployment tips :
  • Minify your JS so it loads faster, but do it for production, not during the development phase.
  • Enable gzip on your server.
  • index.html should be non-cacheable. Cache everything else by version (views, code, images and CSS).
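
On the minification points above: a minifier renames function parameters, and Angular's implicit dependency injection reads those parameter names to decide what to pass in. The toy injector below is only a sketch of that mechanism, not Angular's real implementation; registry, parameterNames and invoke are names invented for this example. It shows why a renamed controller breaks and why the array annotation survives renaming.

```javascript
// Toy registry of injectable services (stand-ins, not the real $scope or $http).
var registry = {
  $scope: { name: 'scope' },
  $http: { name: 'http' }
};

// Read the parameter names out of the function's source text.
function parameterNames(fn) {
  var match = fn.toString().match(/^function\s*[^(]*\(([^)]*)\)/);
  if (!match || match[1].trim() === '') return [];
  return match[1].split(',').map(function (p) { return p.trim(); });
}

// Accept either a plain function (names inferred) or ['dep1', 'dep2', fn].
function invoke(target) {
  var fn, names;
  if (Array.isArray(target)) {
    fn = target[target.length - 1];
    names = target.slice(0, -1);
  } else {
    fn = target;
    names = parameterNames(fn);
  }
  var args = names.map(function (name) {
    if (!(name in registry)) throw new Error('Unknown provider: ' + name);
    return registry[name];
  });
  return fn.apply(null, args);
}

// Unminified: parameter names match the registry.
function mainController($scope, $http) {
  return [$scope.name, $http.name];
}
console.log(invoke(mainController)); // [ 'scope', 'http' ]

// What a minifier might turn it into: the names are gone.
function minified(a, b) {
  return [a.name, b.name];
}
try {
  invoke(minified);
} catch (e) {
  console.log(e.message); // Unknown provider: a
}

// Array annotation: the strings are not renamed, so injection still works.
console.log(invoke(['$scope', '$http', minified])); // [ 'scope', 'http' ]
```

In a real Angular app the equivalent is registering the controller as ['$scope', '$http', function(a, b) { ... }], after which minifying for production is safe.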

That’s all! Enjoy coding!

Install NodeJS from Source in Ubuntu instance on AWS

Since the NodeJS packages in the Ubuntu repositories are very old, it’s best to install from source.


$ sudo apt-get install git

$ sudo apt-get install build-essential

$ sudo apt-get install wget

$ wget source_nodejs_path_from_website

$ tar xvf file_name_nodejs.tar.gz

$ pushd folder_extracted

$ ./configure

$ make

$ sudo make install

$ node --version

$ node

> 1+1

2   //Node is working, press Ctrl-C twice to leave the prompt

$ sudo npm install -g served       //server daemon for NodeJS

$ served 3000

// Visit localhost:3000 and it's live !

If you run into issues and want to clean up before reinstalling


$ rm -rf  ~/node_modules/

$ find . -type d -name node_modules -exec rm -rf {} \;

$ mv ~/.npmrc ~/.npmrc.bak

$ sudo rm -rf /usr/local/*/node*

$ sudo rm -rf /usr/local/*/npm*

$ sudo rm -rf /usr/*/node*

$ sudo rm -rf /usr/*/npm*



M.E.A.N IO

MEAN.IO is very impressive. It gives a developer a fresh start to begin developing an application right out of the box. If you are a fan of SPAs (Single Page Apps) then you will fall in love with it. NodeJS is very fast and therefore popular for realtime SPAs, such as WebSocket-powered chat systems, live content updates etc.

Getting started 

MEAN.IO contains the most popular JS frameworks :

– AngularJS – Front-end MVC framework
– NodeJS – Server-side JavaScript runtime
– ExpressJS – NodeJS web framework (inspired by Sinatra, the Ruby web DSL)
– MongoDB – NoSQL data store

Other tools :

– Grunt – JavaScript task runner
– npm – Package manager for NodeJS
– Passport – Authentication for NodeJS
– bower – Front-end package manager (for Angular)
– Mongoose – Object modeling and querying for MongoDB


#Open your terminal and run the following commands

$ sudo npm install -g grunt-cli

$ sudo apt-get install wget

$ wget http://robomongo.org/files/linux/robomongo-0.8.3-i386.deb

$ sudo dpkg -i robomongo-0.8.3-i386.deb

# I assume that you have already installed NodeJS

$ git clone https://github.com/linnovate/mean.git

$ cd mean

$ npm install    # installs the dependencies

$ bower install

$ grunt

# Open your browser and you can find it on localhost:3000 or 3001.

Now that MEAN is installed and running, what can you expect?

  • You can start developing amazing front-ends using AngularJS. We will talk about this later in the article.
  • You can write powerful MVC applications as Express applications backed by NodeJS. Express is a framework for NodeJS and is bundled with Jade, the template engine. Since it's JavaScript, a few things need to be kept in mind about how it differs from C++:
  1. In C++ or C#, when we talk about objects, we're referring to instances of classes or structs. Objects have different properties and methods depending on their class.
  2. In JavaScript, objects are collections of name/value pairs: think of a JS object as a dictionary with string keys.

Key/value pairs are essential to understand, because every layer of a JavaScript stack application relies on them.
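
The dictionary view of a JavaScript object can be seen in a few lines. This is a minimal sketch; the `user` object and its property names are made up for illustration.

```javascript
// A plain object literal: a collection of name/value pairs, no class needed.
const user = { name: "Ada", "favourite-language": "JavaScript" };

// Properties can be read with dot notation or dictionary-style brackets.
const viaDot = user.name;
const viaKey = user["favourite-language"];

// Keys are strings: the numeric key 42 is stored as the string "42".
user[42] = "answer";
const keys = Object.keys(user);

// Pairs can be added or removed at runtime.
user.age = 36;
delete user.name;
```

This is also why JSON documents map so naturally onto JavaScript objects.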

  • MongoDB – The extremely popular NoSQL data store. Data is stored as documents made of key/value pairs, which fits NodeJS well, but it is a disk-based document store.

That's all!

  • Wait, how do you make SPAs actually real-time? We are missing something like Redis, a data structure server used as an in-memory cache. It is a distant relative of Memcached.

$ sudo apt-get install -y python-software-properties

$ sudo add-apt-repository -y ppa:rwky/redis

$ sudo apt-get update

$ sudo apt-get install -y redis-server

 

Why is it amazing ?

Node.js is a JavaScript engine for the server side. In addition to all the JavaScript capabilities, it includes networking capabilities (like HTTP) and access to the file system. This is different from client-side JavaScript, where the networking tasks are monopolized by the browser and access to the file system is forbidden for security reasons.

Node.js as a web server: Express

Something that runs in the server, understands HTTP and can access files sounds like a web server. But it isn’t one. To make node.js behave like a web server one has to program it: handle the incoming HTTP requests and provide the appropriate responses. This is what Express does: it’s the implementation of a web server in js. Thus, implementing a web site is like configuring Express routes, and programming the site’s specific features.
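
To see what Express takes off your hands, here is a rough sketch of routing done by hand with Node's built-in `http` module. The route table, the `route` helper and the page texts are hypothetical, chosen only for illustration; Express's own routing is far more capable.

```javascript
const http = require("http");

// Hypothetical route table: "METHOD path" -> handler returning a response.
const routes = {
  "GET /": () => ({ status: 200, body: "Home page" }),
  "GET /about": () => ({ status: 200, body: "About page" }),
};

// Look up the handler for a request; fall back to a 404 response.
function route(method, url) {
  const handler = routes[`${method} ${url}`];
  return handler ? handler() : { status: 404, body: "Not found" };
}

// Node only speaks HTTP; deciding what to answer is up to our code.
const server = http.createServer((req, res) => {
  const { status, body } = route(req.method, req.url);
  res.writeHead(status, { "Content-Type": "text/plain" });
  res.end(body);
});

// Uncomment to serve on port 3000 (the port is an arbitrary choice):
// server.listen(3000);
```

With Express, the route table and the lookup collapse into calls such as `app.get("/about", handler)`.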

#Middleware and Connect

Serving pages involves a number of tasks. Many of those tasks are well known and very common, so node’s Connect module (one of the many modules available to run under node) implements those tasks.
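
The middleware idea can be sketched without Connect itself: each task is a function that does its part and then hands control to the next one. The code below is a simplified illustration of the pattern, not Connect's actual implementation; `runMiddleware`, `logger`, `parseQuery` and `respond` are hypothetical names.

```javascript
// Run each middleware in order; a middleware calls next() to pass control on.
function runMiddleware(stack, req, res) {
  let index = 0;
  function next() {
    const middleware = stack[index++];
    if (middleware) middleware(req, res, next);
  }
  next();
}

// Common task 1: record what was requested.
function logger(req, res, next) {
  req.log = `${req.method} ${req.url}`;
  next();
}

// Common task 2: parse the query string into an object.
function parseQuery(req, res, next) {
  const queryString = req.url.split("?")[1] || "";
  req.query = Object.fromEntries(new URLSearchParams(queryString));
  next();
}

// Final handler: builds the response and does not call next().
function respond(req, res) {
  res.body = `Hello ${req.query.name || "stranger"}`;
}

// Plain objects stand in for the real request and response here.
const req = { method: "GET", url: "/hello?name=Ada" };
const res = {};
runMiddleware([logger, parseQuery, respond], req, res);
```

Because every step has the same signature, tasks can be added, removed or reordered without touching the others.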

Concerns that NodeJS Solves

Real-time systems are very powerful from the user's point of view, but they also introduce a number of difficult technical questions and problems, such as:

  • How can you maintain a single realtime session across multiple browser tabs such that changes which a user makes in one tab will immediately be reflected in a different tab?
  • What should you do if a client loses its connection with the server? When and how should you reconnect the client?
  • How to deal with a worker crash? How to maintain in-memory session data after a worker reboots?
  • When is it safe to discard stale session data?
  • What happens when a worker fails to clean up after itself?
  • What if a malicious user tries to hog up resources by keeping thousands of sockets open simultaneously?
  • How to protect your system from DOS attacks over a stateful channel?
  • How can you allow two users connected to different workers to communicate with one another across process boundaries (or host boundaries)?
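
As one small example, the reconnection question is often handled with exponential backoff: wait a little longer after each failed attempt, up to a ceiling. The helper below is a sketch under assumptions; the name `reconnectDelay` and the 500 ms and 30 s values are arbitrary choices, not taken from any library.

```javascript
// Hypothetical helper: delay in milliseconds before reconnect attempt
// number `attempt` (0-based), doubling each time up to a maximum.
function reconnectDelay(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Delays for the first five attempts.
const firstDelays = [0, 1, 2, 3, 4].map((n) => reconnectDelay(n));
```

In practice a random jitter is usually added so that many clients do not all reconnect at the same instant after a worker restart.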

 

AngularJS is awesome!

  • Very awesome single-page applications (different views can be synced in real time and the SPA responds to the changes).
  • Data binding, MVC, routing, testing, jqLite, templates, history, factories, view models, controllers, views, directives, services, dependency injection and validation.

 

AngularJS can do powerful things like routing and interacting with data, so why do we use Express and NodeJS?

The correct answer is :

  • There are things that should be done server-side (i.e. in Express, not Angular), most notably user input validation: Angular runs on the client, so it can be tampered with.
  • Also, if you ever want to offer access other than through the web app (e.g. a mobile app), you'll probably need an API anyway. Express can do this; Angular can't.
  • Finally, database access: an Angular app usually needs to connect to some kind of backend to perform CRUD operations. You'll either go with a hosted DB like Firebase, or you'll end up using your own database. The latter scenario is more common, and you'll need Express (or similar) for that.
  • A web app is not just some HTML pages linked together. There are a lot of other things that need to be implemented:
  1. Model validation.
  2. Keeping the model consistent. Remember that multiple users can access the same model at any given time and even change it.
  3. Controlling resource access.
  4. Triggering workflows.
  5. Business Logic.
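
The validation point can be made concrete with a check that runs on the server no matter what the client-side code did. This is a minimal sketch; `validateSignup`, the field names and the rules are hypothetical, and the email pattern is deliberately loose.

```javascript
// Hypothetical server-side validator for a sign-up payload.
// It never trusts the client: every field is re-checked here.
function validateSignup(input) {
  const data = input || {};
  const errors = [];
  // Loose shape check only: something@something, no spaces.
  if (typeof data.email !== "string" || !/^[^@\s]+@[^@\s]+$/.test(data.email)) {
    errors.push("email is invalid");
  }
  if (typeof data.password !== "string" || data.password.length < 8) {
    errors.push("password must be at least 8 characters");
  }
  return { valid: errors.length === 0, errors };
}
```

An Express route handler would call such a function on the request body and reject the request if `valid` is false, even if the Angular form had already validated the same fields.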

SentAnalysis-py : Code & Screenshots

Last week was crazy! I have been coding a kernel module for the simulation project, and the results were a bit unexpected. So instead I explored sentiment analysis for a friend I was helping with his presentations. It's not that complicated after all, at least at a basic level, and it makes a good skill-up exercise.

You can use the code after :

  1. You have an output.json {contains tweets from the streaming API}. Download a sample version here.
  2. Know that the syntax is Python 2.7, so it won't work with 3.x. Also, I am using AFINN and will soon be using WordNet {in a more complicated way}.
  3. Run it after you have all the required libraries, including json and oauth2.
  4. Run it as $ python sentiments.py <AFINN file> output.json (the script reads the sentiment file first, then the tweet file). I have added a few screenshots.

# Sentiments.py


import sys
import json
import re

def hw(sent_file, tweet_file):
    # Build the sentiment dictionary: term -> score
    sent_dict = {}
    for line in sent_file:
        line_list = line.split()
        if len(line_list) < 2:
            continue  # skip blank or malformed lines
        # Multi-word terms: everything except the last token is the term
        term = " ".join(line_list[:-1])
        sent_dict[term] = float(line_list[-1])

    # Score each tweet (one JSON object per line)
    for line in tweet_file:
        tweet = json.loads(line)
        score = 0
        if 'text' in tweet:
            for word in tweet['text'].split():
                # Strip punctuation; lowercase to match the AFINN terms
                word = re.sub('[^0-9a-zA-Z]+', '', word).lower()
                score += sent_dict.get(word, 0)
        print score

def lines(fp):
    fp.seek(0)  # rewind: hw() has already read the file to the end
    print str(len(fp.readlines()))

def main():
    sent_file = open(sys.argv[1])
    tweet_file = open(sys.argv[2])
    hw(sent_file, tweet_file)
    lines(sent_file)
    lines(tweet_file)

if __name__ == '__main__':
    main()

Sentiments of the Tweet

Lastly, the plot can be done using matplotlib, but I used Google Charts for a quick depiction.

Plot

Powerful and Essential Programming languages : Data Visualization Geeks

The data revolution is heading toward its peak, which makes it all the more interesting for developers to create useful data visualizations that convey the right meaning and help make the case to the business. I am a geek and I like to program with open-source technologies that are interesting and useful.

Here is a list of interesting languages (most built on JavaScript), libraries and toolkits that passionate developers can adopt and start exploring:

D3

D3 (Data-Driven Documents) is very popular and lets you create dynamic visualizations by manipulating documents based on data.

Check documentation, tutorials, examples, Obama campaign  

 

Protovis

Protovis composes custom views of data with simple marks such as bars and dots. It has not been in active development since June 2011, but it has good examples for understanding the visualization approach. It is the predecessor of D3.

Check documentation, examples, EagerEyes

 

Vega

Vega is a visualization grammar, a declarative format for creating, saving and sharing visualization designs.

Check documentation, tutorial

 

Processing.js

You write code using the Processing language, include it in a web page, and Processing.js does the rest. This script framework can help you develop impressive visualizations using web standards. It is backed by Processing, which has been around since 2001.

Check documentation, tutorial, example

 

Quadrigram

With Quadrigram you can create custom data visualizations in an intuitive way with the flexibility of a visual programming language. It enables you to prototype and share your ideas rapidly, as well as produce compelling solutions with your data in the forms of interactive visualizations, animations or dashboards.

Check documentation, tutorial, example