Saturday, 29 December 2012

Hadoop installation procedure ..

Hadoop Architecture:

Hadoop is a powerful software framework for handling petabytes of data. It works with clusters of computers.
Hadoop distributes the data across the systems in the cluster and schedules jobs on them. This scheduling is performed by the "job tracker" in the Hadoop architecture.

A task tracker runs on each system in the cluster, executes the tasks assigned to it by the job tracker, and reports their progress back.
These two components make up the MapReduce layer.
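
The division of work between the job tracker and the task trackers can be sketched in a few lines of Python. This is only a toy model, not Hadoop's code: the function names, the node names, and the round-robin assignment are assumptions made for illustration, and the real job tracker also considers things such as where the data is stored.

```python
from collections import defaultdict


def assign_tasks(tasks, trackers):
    """Toy job tracker: hand out tasks to task trackers in round-robin order.

    Round-robin is an assumption for illustration; it is not Hadoop's
    actual scheduling policy.
    """
    assignment = defaultdict(list)
    for index, task in enumerate(tasks):
        tracker = trackers[index % len(trackers)]
        assignment[tracker].append(task)
    return dict(assignment)


def run_tracker(tracker, tasks):
    """Toy task tracker: 'run' each task and report its status back."""
    return [(tracker, task, "done") for task in tasks]


# Hypothetical task and node names, for illustration only.
plan = assign_tasks(["map-0", "map-1", "map-2", "reduce-0"], ["node-a", "node-b"])
reports = [
    report
    for tracker, tasks in plan.items()
    for report in run_tracker(tracker, tasks)
]
```

Here plan maps each node to the tasks it received, and reports is the list of status messages that the job tracker would collect.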

The name node in the Hadoop architecture manages the file system metadata: it keeps track of which machines in the cluster hold which blocks of data. There is also a secondary name node. Despite its name, it does not take over when the primary name node is down; it periodically checkpoints the name node's metadata so that the metadata can be recovered after a failure.

Requirements:
Oracle Java 6 (JDK 1.6) or later
Ubuntu 10.04
SSH

Installing Java on Ubuntu:
# Add Ferramosca Roberto's repository to your apt repositories
# See https://launchpad.net/~ferramroberto/
#
$ sudo apt-get install python-software-properties
$ sudo add-apt-repository ppa:ferramroberto/java

# Update the source list
$ sudo apt-get update

# Install Sun Java 6 JDK
$ sudo apt-get install sun-java6-jdk

# Select Sun's Java as the default on your machine.
# See 'sudo update-alternatives --config java' for more information.
#
$ sudo update-java-alternatives -s java-6-sun
The full JDK will be placed in /usr/lib/jvm/java-6-sun.
To check the installation, type:
user@ubuntu:~# java -version
To create a dedicated Linux user for running Hadoop, use the following commands.
This separates the Hadoop processes from other applications on the machine (security, access rights, etc.).
$ sudo addgroup hadoop
$ sudo adduser --ingroup hadoop hduser

This adds the hduser user to the Ubuntu machine; you then need to log in to that account.

Next, you need to configure SSH, which Hadoop uses to reach its nodes. For this, the hduser user needs a private and public key pair.

The following command generates the key pair:

 

hduser@ubuntu:~$ ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hduser/.ssh/id_rsa):
Created directory '/home/hduser/.ssh'.
Your identification has been saved in /home/hduser/.ssh/id_rsa.
Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.
The key fingerprint is:
9b:82:ea:58:b4:e0:35:d7:ff:19:66:a6:ef:ae:0e:d2 hduser@ubuntu
The key's randomart image is:
[...snipp...]

This command creates an RSA key pair with an empty passphrase (-P ""), so the key can be used without prompting for a passphrase.

 

Second, you have to enable SSH access to your local machine with this newly created key.

hduser@ubuntu:~$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

The final step is to test the SSH setup by connecting to your local machine with the hduser user. The step is also needed to save your local machine’s host key fingerprint to the hduser user’s known_hosts file. If you have any special SSH configuration for your local machine like a non-standard SSH port, you can define host-specific SSH options in $HOME/.ssh/config (see man ssh_config for more information).

hduser@ubuntu:~$ ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is d7:87:25:47:ae:02:00:eb:1d:75:4f:bb:44:f9:36:26.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
Linux ubuntu 2.6.32-22-generic #33-Ubuntu SMP Wed Apr 28 13:27:30 UTC 2010 i686 GNU/Linux
Ubuntu 10.04 LTS
[...snipp...]
hduser@ubuntu:~$

 

 

Thursday, 27 December 2012

Hadoop: The Power of the Elephant!

 

Have you ever handled or seen a massive amount of data? There is an open source software that handles massive (big) data every day. Yes, that is Hadoop, the elephant.

Hadoop is an open source framework for handling massive amounts of data. It can handle petabytes of data per day.

The Hadoop framework was started in 2006 by Doug Cutting, who was working at Yahoo! at that time.

It implements the ideas described in Google's MapReduce paper. He named the project Hadoop after his son's toy elephant.

Hadoop uses the MapReduce technique.
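
As a rough illustration of the technique, here is a word count written as map, shuffle, and reduce steps in plain Python. It runs on a single machine and does not use Hadoop's API; the function names are made up for this sketch.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three steps in sequence on a list of input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data big cluster", "data node"])
```

In a real cluster the map calls run in parallel on the machines that hold the input, and the grouped keys are spread over several reducers; the sketch only shows the data flow.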

The work is distributed across multiple machines in the cluster. The main difference between grid computing and Hadoop is that in grid computing the processes are always running and the data is sent to them.

In Hadoop, the data is placed on the machines first and the processing is then started where the data is stored.

Because the data is distributed across multiple machines, there is a name node that keeps track of which data is held by which machine.

The architecture of Hadoop is as follows:

[Figure: Hadoop cluster]

The file system used in Hadoop is HDFS, the Hadoop Distributed File System.

It splits the data into blocks and stores them on different machines. Each block is replicated, so the data remains available even if a machine fails.
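
The idea of splitting and replicating can be sketched in Python as follows. This is a simplified model, not HDFS code: the function names, the node names, the tiny block size, and the round-robin placement are all assumptions for illustration. Real HDFS uses much larger blocks and a smarter placement policy.

```python
def split_into_blocks(data, block_size):
    """Cut the data into fixed-size blocks; the last block may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks, nodes, replication):
    """Map each block id to the nodes that store a copy of it.

    Placement is simple round-robin, which is an assumption for this sketch.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


def holders_after_failure(block_map, failed_node):
    """Return, for each block, the nodes that still hold a copy."""
    return {
        block_id: [node for node in holders if node != failed_node]
        for block_id, holders in block_map.items()
    }


# Hypothetical node names and a tiny block size, for illustration only.
blocks = split_into_blocks(b"abcdefghij", 4)
block_map = place_replicas(len(blocks), ["node-a", "node-b", "node-c"], 2)
survivors = holders_after_failure(block_map, "node-b")
```

The block_map dictionary plays the role of the name node's bookkeeping: it records which machine holds which block. With two copies of every block, survivors still lists at least one node for each block after node-b fails.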

Applications of Hadoop

  • Log and/or clickstream analysis of various kinds
  • Marketing analytics
  • Machine learning and/or sophisticated data mining
  • Image processing
  • Processing of XML messages
  • Web crawling and/or text processing
  • General archiving, including of relational/tabular data, e.g. for compliance

Users of Hadoop around the Globe

  • TCS
  • CTS
  • Amazon
  • eBay
  • Akamai
  • Yahoo!
  • Google
  • IBM
  • Microsoft
  • etc.

Saturday, 1 December 2012

Converting Windows XP to Windows 8 at Lightning Speed!!

STEPS TO CONVERT WINDOWS XP/WINDOWS 7 TO WINDOWS 8

STEP 1:

Download the Windows 8 Transformation Pack from here.

It is completely free.

STEP 2:

Unpack the package file, find the windows8.exe file, and double-click on it.

It will open the installer.

One requirement is that the .NET Framework version 4 or version 2 must be installed on your system.

If you don't have it, don't worry: download it from here! It is also free.

STEP 3:

Run the installer and wait for some time. It will update the files that are needed.

Restart the system and complete the installation.

On the next start it will give the Windows 8 appearance.

If you didn't get the Windows 8 menu style, download Rainmeter and its Windows 8 menu setup and install them.

That's all: Windows 8 without any cost. EnJoY!!

Windows8