Topic-Specific Web Resource Crawling for Quality Controlling of Automated Search Engines – (UNDERGRADUATE RESEARCH PROJECT).

August 22, 2011

Search engines have played a vital role in my life since I entered university as an engineering undergraduate.

After completing my first year at university I realized that it is governed by “lecture –> examine”, not “self-learning -> work hard on real things”, and at the time that was a “POOOOVH” on my dreams. I wanted to be someone who really works on advanced stuff and thinks differently from others, not simply a student who knows everything in the books and can solve differential equations but does not know how to use them to make something that makes human life more comfortable.

So I knew the time had come for a turning point. I decided to learn software engineering subjects, computer languages and frameworks by myself, because my subject list did not meet my needs. Life was hard because of that decision, but I had the spirit to fight on. On that quest Google was my friend and teacher: it helped me find e-books, tutorials, demos, the bleeding edge of technology, the movements of the world economy and how they affect my future career, what is going on in the current job market, job requirements, international-level qualifications and much more.

That is how I formed an incomparable friendship with Google. After working together for more than three years, I know how Google behaves with different kinds of search methods. If you are not aware of them, please visit Basic search help : Google, where you can find a set of examples that describe how to use Google well. But Google still has some very significant problems.

1. Result pages full of junk

Here is an example. Imagine you want to find an e-book on Eclipse IDE plug-in development, so you use “eclipse plugin eBook download” as the search query.


This is Google’s top link for it.


Just because the page contains the text “Eclipse plugins free eBook at …”, Google gives us a page with no information relevant to our search. How sad.

2. Wasting the valuable time of the seeker.

3. Not identifying the exact requirements of the seeker.

Normally when we search we use just one or two keywords.


It just pops up a bunch of links in 0.16 seconds. But the thing is, when we search we have a list of things we want to know that we did not enter in the search query. For example, above I was searching to buy a laptop, so I wanted to know the prices, brands, performance, etc. That information should be there in the first 10 links Google gives us, but if you look closely at Google's results you will realize the quality is not there.

4. Giving junk results repeatedly even when the seeker expands the query.

As I mentioned above, the junk results given by Google pop up repeatedly even if we change the length or the meaning of the search query.

1. Search using the query “Eclipse plugin eBook download free”.


The junk link is still there.

2. Search using the query “eclipse plugin development eBook download free”.


How sad. The junk link is still there.

I have already visited it and confirmed that the site does not contain the result I want, but Google still suggests it to me. With a simple use of cookies, those visited results could be omitted.

So that is my friend Google. Now let's talk about us. Our undergraduate research project is to build a “Quality Controlling” mechanism for the above results using “Topic-Specific Web Resource Crawling”. It was an idea from my friend Amith, the navigational component on this quest.

What is our solution?

•AI based focused crawler automatically builds a web directory.

1.-AI based + Identified user expectations


3.-Eliminate junk results as much as possible

4.-Web Directory is updated more frequently

•System Architecture


•How did we do it?

–Get information from human-edited directory – ODP (*Human Edited Web directory)

–Identified most frequent keywords in a particular category


•URL Finder

Each time, send a combination of the keywords to Google and store the results.
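To illustrate the URL Finder step, here is a minimal Python sketch of how the keyword combinations could be turned into query strings. The function name `build_queries` and the sample keywords are only assumptions for illustration; this is not the code of our system, and the actual call to the search engine is left out.

```python
from itertools import combinations


def build_queries(keywords, size=2):
    """Return one query string for every combination of `size` keywords."""
    # Lower-case and de-duplicate so "Diamond" and "diamond" give one keyword.
    unique = sorted(set(k.lower() for k in keywords))
    return [" ".join(combo) for combo in combinations(unique, size)]


if __name__ == "__main__":
    # Made-up sample keywords for one category.
    for query in build_queries(["diamond", "jewellery", "carat"]):
        print(query)
```

Each query string would then be sent to the search engine and the returned links stored in the URL pool.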


•Web Crawler

Crawl the resulting websites and count the frequency of the related keywords, for use by the statistics analyzer.
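The counting part of the crawler can be sketched roughly like this in Python. It assumes the page HTML has already been downloaded; `TextExtractor` and `keyword_frequency` are hypothetical names used only for this example, and the real system may count differently (for instance with stemming or phrases).

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the visible text of a page, skipping <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def keyword_frequency(html, keywords):
    """Return {keyword: number of exact word matches in the page text}."""
    extractor = TextExtractor()
    extractor.feed(html)
    words = re.findall(r"[a-z0-9]+", " ".join(extractor.parts).lower())
    counts = Counter(words)
    return {k: counts[k.lower()] for k in keywords}
```

Only exact word matches are counted here, so "rings" does not count as "ring".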

•Link Analyzer

Eliminate duplicate results in the URL pool.
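A simple way to do this kind of duplicate elimination is sketched below in Python. The normalization rules (ignore host case, trailing slash and #fragment) are my assumption of what "same result" means, and `normalize` and `unique_urls` are hypothetical names, not functions from our system.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Reduce a URL to a comparison key: lower-case scheme and host,
    no trailing slash on the path, no #fragment."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def unique_urls(urls):
    """Keep the first occurrence of every URL, preserving the original order."""
    seen = set()
    result = []
    for url in urls:
        key = normalize(url)
        if key not in seen:
            seen.add(key)
            result.append(url)
    return result
```

Keeping the first occurrence preserves the order in which the search engine returned the links.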

•Calculate the keyword p.u. values of each website



*The graph above (generated by submitting the system results to MATLAB) shows that the first (top) Google result has a lower frequency of the keyword we are searching for than the 2nd and 7th results. So it is obvious that the top Google results are not the results we are expecting as users.

  • Statistics Analyzer


*The screenshot above shows the top-ranked sites according to our system's results for the keyword “Diamond”.


– contribute more sources to find relevant keywords

– Human supervision is better

– use phrases rather than keywords

So it was a privilege to contribute to this work, and it is one of the greatest things I have done in my university life, other than my undergraduate project humiee. It was an incomparable experience that inspired me, because I have always wanted to learn for the quest, not for the marks.

Categories: Uncategorized

Speech Enabled Input Box

June 8, 2011

I had dreamed about a kind of search. Imagine you want to download a song and the only thing you have is its rhythm, let's say naaa na na naaa na naa naa na… that's all. How do you find the song? Never mind the exact one, how do you find anything related to that sound? We cannot do that with existing search engines. But today, while I was watching the Google I/O 2011 day 2 keynote, a member of the Chrome browser team showed a fancy icon with some extraordinary features.

*Note that this only works in the newest version of the Chrome browser, so you have to install it to get this experience.



A text box that can grab speech from the microphone and show it as text. My idea of music search popped up right away, and I wanted to know how they do it, because it is an approach that reaches toward my crazy idea. So here begins the investigation.

1. Simply go to the Google Translate page (I am using Google Chrome).

I wanted to find the tag responsible for this fancy feature. Chrome has cool developer tools similar to Firebug: right-click the content and select “Inspect element”.


2. Then, with a careful look, I found the tag responsible for our item.


3. Copy it and use it in my own HTML file.

  <HTML>
  <HEAD>
  <TITLE>Test Input</TITLE>
  <SCRIPT LANGUAGE="JavaScript">
  // Show whatever was typed (or spoken) into the input box
  function testResults (form) {
    var TestVar = form.inputbox.value;
    alert ("You typed: " + TestVar);
  }
  </SCRIPT>
  </HEAD>
  <BODY>
  <FORM NAME="myform" ACTION="" METHOD="GET">Enter something in the box: <BR>
  <input id="gt-speech-in" name="inputbox" type="text" speech="speech" x-webkit-speech="x-webkit-speech" x-webkit-grammar="builtin:translate" size="10" lang="en" style=""><P>
  <INPUT TYPE="button" NAME="button" Value="Click" onClick="testResults(this.form)">
  </FORM>
  </BODY>
  </HTML>

This is the output I got. I tried to add that HTML to this post, but unfortunately the WordPress editor edits it in its own way, so the features were gone right after adding it. So I can only show you an image of the output, but you can try it yourself using the code above.



Try it out: a better web with cool features.

Categories: HTML

How to Connect MySQL with Python

June 8, 2011


Since this morning I had been trying to get a database connection for my test Python program. But it was very time-consuming to find a proper Python module to connect to the DB, and after a while I was wondering how such a popular programming language does not have a fast and easy connector module for a DB as popular as MySQL.

Then I asked my friendly and kind teacher Google how to connect to MySQL in Python. Oh my kind teacher, how sweet he is and how poor his PageRank algorithm: he gave me a bunch of links, but unfortunately the first 5 links could not do the job for me. Then, after changing the query 2 or 3 times, I found a link and got the module installed (for the Python 2.7 Windows 32-bit version) on my Windows 7 machine, but the import statement in the Eclipse Python editor still had a red mark.

Never give up: that is the spirit of a software engineer, and so it is mine. I returned to Google, my poor friend. After some time I found a link called Unofficial Windows Binaries for Python Extension Packages.

That is why they say “Never give up”. Thanks to the maintainers of that super bunch of links, I got MySQL-python-1.2.3.win32-py2.7.exe  [1.0 MB]  [Python 2.7]  [32 bit]  [Sep 01, 2010]. It works for me.

This is the script for connecting to MySQL with Python. I have finally done it.

'''
Created on Jun 8, 2011

@author: kalpa
'''
import MySQLdb

class MyClass(object):

    def __init__(self):
        # Connect to the local "tankmap" database as root with an empty password
        self.conn = MySQLdb.connect(host="localhost", user="root", passwd="", db="tankmap")
        cursor = self.conn.cursor()
        cursor.execute("SELECT tankName from tanks")
        # cursor.description has one tuple per result column; item 0 is the column name
        names = [f[0] for f in cursor.description]
        # Print every value next to its column name
        for row in cursor.fetchall():
            for pair in zip(names, row):
                print('%s: %s' % pair)
        cursor.close()
        self.conn.close()

foo = MyClass()

If you want to test the code, you can download the SQL script I used for this from here.
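If you do not have a MySQL server at hand, the same pattern (execute a query, read the column names from cursor.description, pair them with each row) can be tried with the sqlite3 module from the standard library, which follows the same Python DB-API. This is only a sketch with SQLite swapped in for MySQL; the function name `fetch_labelled_rows` and the two tank names are made up for the example.

```python
import sqlite3


def fetch_labelled_rows(conn, query):
    """Run `query` and return every row as a list of (column name, value) pairs."""
    cursor = conn.cursor()
    try:
        cursor.execute(query)
        # cursor.description has one tuple per column; item 0 is the column name.
        names = [col[0] for col in cursor.description]
        return [list(zip(names, row)) for row in cursor.fetchall()]
    finally:
        cursor.close()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE tanks (tankName TEXT)")
    conn.executemany("INSERT INTO tanks VALUES (?)", [("Tank A",), ("Tank B",)])
    for row in fetch_labelled_rows(conn, "SELECT tankName FROM tanks ORDER BY tankName"):
        for name, value in row:
            print("%s: %s" % (name, value))
    conn.close()
```

With MySQLdb only the connect() call changes; the cursor handling stays the same.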


Categories: python

Ajax and Beyond..

June 6, 2011

After Ajax (Asynchronous JavaScript and XML) emerged, a dream of web developers came true, because it uses server time in a different way. It can be explained simply as below.


The image above shows the traditional request/response style of client-server communication. Each time a server request is made, the page must refresh to reveal new content.


The Internet request/response model using Ajax’s asynchronous methodology. Multiple server requests can be made from the page without the need for a further page refresh.

Ajax shifted the internet into OD (overdrive). As some of the first to use Ajax in enterprise-level web applications, Gmail and Flickr took big steps ahead; even now the Gmail mail experience is a very rich one (everyone has a Gmail account today, so you know how it goes).

With the great acceleration from Ajax, the web broke its traditional response-time (speed) barrier and a new bunch of possibilities emerged. Developers began to think about desktop-like applications on the web! At the same time, web standards were getting richer (CSS, HTML 5, etc.). Then came the term RIA (Rich Internet Applications).

Rich clients can also be divided into several categories according to various constraints. The image below shows that categorization.


So there are several Rich Client Makers:

1. Adobe Flex

2. Microsoft Silverlight

3. JavaFX

4. OpenLaszlo

5. GWT (Google Web Toolkit)

There are a few others, but these are the big guys.

I will discuss the Rich Client Maker technologies above in this article series: not only the details but also how to use them, their tools, their advantages and disadvantages, and their current status according to my experience with them. In particular, I will show how to develop rich client applications with GWT, with code snippets. This article is the first of that series.


Categories: Ajax, Rich Client Applications

How to Make Google map application using PHP/Mysql

June 6, 2011


This is my second post. Since the first one in October 2010 I have been very busy with my academic and internship work. So today I will show you, step by step, how to develop a Google Maps application using PHP/MySQL.

First of all you must get a key for using Google Maps, so go to the Google Maps API key sign-up page and follow the steps; then you will have a key for using Google Maps. Please read the terms and conditions for using it.


To follow this tutorial you need the tools below. I will explain how to set up the infrastructure for both Windows and Linux users.

For Linux users

1. Just Google “How to install LAMP”

-> You will find thousands of links on how to do that, so I am not going to explain it here.

For Windows users

1. Download XAMPP from here.

2. Install it on your computer. It has a nice GUI, so just start the MySQL server and the Apache server.

Now we have the infrastructure to run PHP code backed by a MySQL database connection.

(This applies to both Linux and Windows users; Linux users have to start Apache and MySQL manually from the terminal.)

Now the second step. For this step we should have an IDE, because it is convenient: as we go further and the code becomes complex, it will not be easy to find errors or faults if we develop directly in a plain editor. I know most Linux users like vim and Windows users like Notepad for this kind of thing; it is up to you. But for my own ease, and for the beginners who will read this, I am using the NetBeans IDE.

If you do not have it, download it from here.

After installing it, you have the full infrastructure to develop our application.

(Please check whether your Apache server and MySQL server are running properly; you can easily find out how by Googling it.)

Open NetBeans and you will see a screen like the one below.



Go to File->New Project

Then a wizard will appear. Find the PHP item on the right panel and select it. It will look like below.



Select PHP Application from the right panel and press Next…

Then give the details on the next wizard page. It will ask for the project name –> give it as “map”.

The most important thing is to set the project location to the right place for the relevant server: in XAMPP it is “XAMPP\xampp\htdocs”, and in WAMP or LAMP it is the “www” folder.

See the screenshot below; if you have any doubts, post them in the comment section.



Press Next –> to go to the next wizard page, which asks for the run configuration for this application. It will look like below. Leave the default settings, but do not do things blindly: read those fields and understand what they are for.

Press the Finish button and you will have a newly created PHP project with an index.php file.


Hmm… tired? Do not give up, I will tell you how to do this.



This is how it will look.

If you want to check your settings, just put your name into the auto-generated PHP code:

echo "Your name";

Now save the content, right-click the project and select Run. If everything is OK, your browser will appear and show your name, just like below.



Now it is time to create our Google Maps application.

As the first step we should create a database. It is very simple; I will give you the SQL to create the database for this application.

For that, type this URL in your browser's address bar: http://localhost/phpmyadmin/

Create a database named “tankmap”.


Download the tankmap.sql file from here.

After creating the database, go to the “Import” tab in the phpMyAdmin dashboard.


Now locate the tankmap.sql file in your file system and make sure to select the “SQL” radio button in the “Format of imported file” section. Now you have the database we need to develop this application.

Let's begin the code….

1. Right-click the Source Files folder in the project and go to New->PHP file.



Give the name as “dbinfo” and copy and paste the content below into it.


It holds the database-related details (the $username, $password and $database values that xml_gen.php uses below).

Now the most interesting part: how are we going to consume the database we created in our map?

PHP is for that…. In this tutorial I have developed a PHP file that generates XML by reading the database. OK, OK, I know things are getting heavy; back to the code, and then I will explain everything, it's a promise.

2. Create a new PHP file called xml_gen.php.

Add this content to it. I have added comments to the code, and I will explain them below.



<?php
// Load $username, $password and $database
// (assumes the file from step 1 was saved as dbinfo.php)
require("dbinfo.php");

// Start XML file, create parent node
$dom = new DOMDocument("1.0");
$node = $dom->createElement("tanks");
$parnode = $dom->appendChild($node);

// Opens a connection to a MySQL server
// (the mysql_* functions used here were removed in PHP 7; newer versions need mysqli or PDO)
$connection = mysql_connect('localhost', $username, $password);
if (!$connection) {
    die('Not connected : ' . mysql_error());
}

// Set the active MySQL database
$db_selected = mysql_select_db($database, $connection);
if (!$db_selected) {
    die('Can\'t use db : ' . mysql_error());
}

// Select all the rows in the tanks table
$query = "SELECT * FROM tanks WHERE 1";
$result = mysql_query($query);
if (!$result) {
    die('Invalid query: ' . mysql_error());
}

header("Content-type: text/xml");

// Iterate through the rows, adding XML nodes for each
while ($row = @mysql_fetch_assoc($result)) {
    $node = $dom->createElement("tank");
    $newnode = $parnode->appendChild($node);
    $newnode->setAttribute("lat", $row['latitude']);
    $newnode->setAttribute("lng", $row['longitude']);
    $newnode->setAttribute("damtype", $row['damtype']);
    $newnode->setAttribute("damheight", $row['damheight']);
    $newnode->setAttribute("nofCanals", $row['nofCanals']);
    $newnode->setAttribute("capacity", $row['capacity']);
    $newnode->setAttribute("presentCondition", $row['presentCondition']);
    $newnode->setAttribute("maintainAuthority", $row['maintainAuthority']);
    $newnode->setAttribute("surface", $row['surface']);
    $newnode->setAttribute("catchment", $row['catchment']);
    $newnode->setAttribute("purpose", $row['purpose']);
    $newnode->setAttribute("nofFamerFamilies", $row['nofFamerFamilies']);
    $newnode->setAttribute("nearestTown", $row['nearestTown']);
    $newnode->setAttribute("rainfallData", $row['rainfallData']);
    $newnode->setAttribute("image", $row['image']);
}

echo $dom->saveXML();
?>
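To make the shape of the generated XML easier to see, here is a small Python sketch that builds the same kind of document (a tanks root with one tank element per row, every column stored as an attribute) using the standard xml.etree.ElementTree module. It is only an illustration, not part of the PHP application; `rows_to_xml` is a hypothetical name and the sample row is made up.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(rows):
    """Build <tanks><tank .../>...</tanks> from a list of row dictionaries."""
    root = ET.Element("tanks")
    for row in rows:
        tank = ET.SubElement(root, "tank")
        # The map code reads the coordinates from "lat" and "lng".
        tank.set("lat", str(row["latitude"]))
        tank.set("lng", str(row["longitude"]))
        # Every other column is copied under its own column name.
        for column, value in row.items():
            if column not in ("latitude", "longitude"):
                tank.set(column, str(value))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Made-up sample row; real rows come from the tanks table.
    sample = [{"latitude": 6.082083, "longitude": 80.542277, "damtype": "Earth"}]
    print(rows_to_xml(sample))
```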


You must see the output of this PHP file. You can see it using IE: paste the URL http://localhost/map/xml_gen.php and you will get generated XML content like this.


Here goes the explanation.






3. Now it is time to create the presentation layer of this application and use the Cooool Google Maps API…

Create a new HTML file and copy this content into it…

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "">
<html xmlns="">
    <head>
        <meta http-equiv="content-type" content="text/html; charset=utf-8"/>
        <!-- Put the Google Maps API script URL, including your key, in src -->
        <script src="" type="text/javascript"></script>
        <link type="text/css" href="css/style.css" rel="stylesheet" media="all" />
        <script type="text/javascript">

            // Custom marker icon (the image and shadow URLs go in the empty strings)
            var iconBlue = new GIcon();
            iconBlue.image = '';
            iconBlue.shadow = '';
            iconBlue.iconSize = new GSize(12, 20);
            iconBlue.shadowSize = new GSize(22, 20);
            iconBlue.iconAnchor = new GPoint(6, 20);
            iconBlue.infoWindowAnchor = new GPoint(5, 1);

            // Create the map, then load the XML and add one marker per <tank>
            function load() {
                if (GBrowserIsCompatible()) {
                    var map = new GMap2(document.getElementById("map"));
                    map.addControl(new GSmallMapControl());
                    map.addControl(new GMapTypeControl());
                    map.setCenter(new GLatLng(6.082083, 80.542277), 13);

                    GDownloadUrl("xml_gen.php", function(data) {
                        var xml = GXml.parse(data);
                        var markers = xml.documentElement.getElementsByTagName("tank");
                        for (var i = 0; i < markers.length; i++) {
                            var name = markers[i].getAttribute("tankname");
                            var damtype = markers[i].getAttribute("damtype");
                            var point = new GLatLng(parseFloat(markers[i].getAttribute("lat")),
                                                    parseFloat(markers[i].getAttribute("lng")));
                            var damheight = markers[i].getAttribute("damheight");
                            var nofCanals = markers[i].getAttribute("nofCanals");
                            var capacity = markers[i].getAttribute("capacity");
                            var presentCondition = markers[i].getAttribute("presentCondition");
                            var maintainAuthority = markers[i].getAttribute("maintainAuthority");
                            var surface = markers[i].getAttribute("surface");
                            var catchment = markers[i].getAttribute("catchment");
                            var purpose = markers[i].getAttribute("purpose");
                            var nofFamerFamilies = markers[i].getAttribute("nofFamerFamilies");
                            var nearestTown = markers[i].getAttribute("nearestTown");
                            var rainfallData = markers[i].getAttribute("rainfallData");
                            var imagePath = markers[i].getAttribute("image");
                            if (imagePath == "") {
                                imagePath = "Image Not Available at this time";
                            }
                            var marker = createMarker(point, name, damtype, damheight, nofCanals, capacity, presentCondition,
                                maintainAuthority, surface, catchment, purpose, nofFamerFamilies, nearestTown, rainfallData, imagePath);
                            map.addOverlay(marker);
                        }
                    });
                }
            }

            // Build a marker whose info window has an "Information" and a "Location" tab
            function createMarker(point, name, damtype, damheight, nofCanals, capacity, presentCondition,
                maintainAuthority, surface, catchment, purpose, nofFamerFamilies, nearestTown, rainfallData, imagepath) {
                var marker = new GMarker(point, iconBlue);
                var tabs = [];
                // Create tabs and add them to the array

                var html = "<b>" + name + "</b> <br/>" + '<small>Dam Type:</small>' + '<small>' + damtype + '</small>'
                    + "<br/>" + '<small>Dam Height:</small>' + '<small>' + damheight + '</small>'
                    + "<br/>" + '<small>Num Of Canals:</small>' + '<small>' + nofCanals + '</small>'
                    + "<br/>" + '<small>Capacity :</small>' + '<small>' + capacity + '</small>'
                    + "<br/>" + '<small>Present Condition :</small>' + '<small>' + presentCondition + '</small>'
                    + "<br/>" + '<small>Authority :</small>' + '<small>' + maintainAuthority + '</small>'
                    + "<br/>" + '<small>Surface :</small>' + '<small>' + surface + '</small>'
                    + "<br/>" + '<small>Catchment :</small>' + '<small>' + catchment + '</small>'
                    + "<br/>" + '<small>Purpose :</small>' + '<small>' + purpose + '</small>'
                    + "<br/>" + '<small>No Of Farmer Families :</small>' + '<small>' + nofFamerFamilies + '</small>'
                    + "<br/>" + '<small>Nearest Town :</small>' + '<small>' + nearestTown + '</small>'
                    + "<br/>" + '<small>Rain Fall Data :</small>' + '<small>' + rainfallData + '</small>';

                var content = '<div id="info">' +
                    '<img src="' + imagepath + '" alt=""  width="150px" height="150px"/>' +
                    '<h2>' + name + '</h2>' +
                    '</div>';

                tabs.push(new GInfoWindowTab('Information', html));

                var tab2 = new GInfoWindowTab("Location", content);
                tabs.push(tab2);
                GEvent.addListener(marker, 'click', function() {
                    marker.openInfoWindowTabsHtml(tabs);
                });

                return marker;
            }
        </script>
    </head>
    <body onload="load()" onunload="GUnload()">
        <div id="map" style="width: 100%; height: 100%"></div>
    </body>
</html>



1. This is how we use the obtained Google key.


2. Create the marker for our map.


3. The load function…


4. Extract data from the XML.


5. The rest of the code is responsible for making the marker a fancy one.
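The data extraction in step 4 does not depend on the Maps API, so it can also be sketched on its own. The Python snippet below reads the same kind of XML that xml_gen.php produces and turns every tank element into a (lat, lng) point plus a dictionary of the remaining attributes, including the fallback text for a missing image. `parse_tanks` is a hypothetical helper for illustration only; the page itself does this work in JavaScript.

```python
import xml.etree.ElementTree as ET


def parse_tanks(xml_text):
    """Turn each <tank> element into {'point': (lat, lng), 'info': {...}}."""
    markers = []
    for tank in ET.fromstring(xml_text).iter("tank"):
        info = dict(tank.attrib)
        lat = float(info.pop("lat"))
        lng = float(info.pop("lng"))
        # Same fallback as the page: an empty image path becomes a message.
        if not info.get("image"):
            info["image"] = "Image Not Available at this time"
        markers.append({"point": (lat, lng), "info": info})
    return markers
```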

If you followed along with me correctly, enter this URL in your browser.



This will be the output.






So we made it. (Note that if you want those fancy images there, add them to a folder called images, e.g. images/YOUR_IMAGE.jpg, .png, etc.)


How to make Simple J2ME Game Canvas Navigation Controller.

October 2, 2010

Howdy, this is my first blog post. I’m highly interested in Java and open source technology. This blog will talk about a wide spread of Java-related things, all based on my experiences. I have a thirst for new technology and for using it to create solutions that make life easier.

This post is about J2ME gaming. It will guide you through making a simple game canvas controller. I do not know whether the title is fully correct, but below I’ll show the thing I mean to create.

Initial Requirements.

  1. A Java installation, JDK 1.5 or higher.
  2. Sun Java Wireless Toolkit 2.5.2_01.
  3. Your preferred IDE, for convenience.


  • Device Configuration CLDC-1.1
  • Device profile MIDP-2.0

Let's start….

1. Create a class called NavigationCanvas that extends the GameCanvas class and implements the Runnable interface.

The source code for the NavigationCanvas class is shown below.

import javax.microedition.lcdui.Graphics;
import javax.microedition.lcdui.Image;
import javax.microedition.lcdui.game.GameCanvas;
import javax.microedition.lcdui.game.Sprite;

/**
 * @author kalpa
 */
public class NavigationCanvas extends GameCanvas implements Runnable {

    private boolean isPlay;          // Game loop runs when isPlay is true
    private long delay;              // To give thread consistency
    private int currentX, currentY;  // To hold current position of the 'X'
    private int width;               // To hold screen width
    private int height;              // To hold screen height
    private Image facesImage;        // Not used in this example
    private Sprite facesSprite;      // Not used in this example

    // Constructor and initialization
    public NavigationCanvas() {
        // GameCanvas needs this argument; true suppresses key events
        // because the keys are polled with getKeyStates() instead
        super(true);
        width = getWidth();
        height = getHeight();
        currentX = width / 2;
        currentY = height / 2;
        delay = 20;
    }

    // Automatically start thread for game loop
    public void start() {
        isPlay = true;
        Thread t = new Thread(this);
        t.start();
    }

    public void stop() {
        isPlay = false;
    }

    // Main Game Loop
    public void run() {
        Graphics g = getGraphics();
        while (isPlay == true) {
            input();
            drawScreen(g);
            try {
                Thread.sleep(delay);
            } catch (InterruptedException ie) {
            }
        }
    }

    // Method to Handle User Inputs
    private void input() {
        int keyStates = getKeyStates();

        // Left
        if ((keyStates & LEFT_PRESSED) != 0) {
            currentX = Math.max(0, currentX - 1);
        }

        // Right
        if ((keyStates & RIGHT_PRESSED) != 0) {
            if (currentX + 5 < width) {
                currentX = Math.min(width, currentX + 1);
            }
        }

        // Up
        if ((keyStates & UP_PRESSED) != 0) {
            currentY = Math.max(0, currentY - 1);
        }

        // Down
        if ((keyStates & DOWN_PRESSED) != 0) {
            if (currentY + 10 < height) {
                currentY = Math.min(height, currentY + 1);
            }
        }
    }

    // Method to Display Graphics
    private void drawScreen(Graphics g) {
        g.setColor(0xffffff);  // background colour (assumed value, lost from the listing)
        g.fillRect(0, 0, getWidth(), getHeight());
        g.setColor(0x0000ff);  // colour of the 'X' (assumed value, lost from the listing)
        g.drawString("X", currentX, currentY, Graphics.TOP | Graphics.LEFT);
        flushGraphics();       // copy the off-screen buffer to the display
    }
}
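The heart of the input() method is a bitmask test followed by a clamped move. Stripped of the J2ME classes, the idea can be sketched in a few lines of Python. The four constants below are illustrative bit values chosen for this example, not the real MIDP constants, and `move` is a hypothetical function name.

```python
# Illustrative bit flags (NOT the real MIDP values): one bit per direction key.
LEFT_PRESSED, RIGHT_PRESSED, UP_PRESSED, DOWN_PRESSED = 1, 2, 4, 8


def move(x, y, key_states, width, height):
    """Return the new (x, y) after one game-loop step, kept inside the screen."""
    if key_states & LEFT_PRESSED:
        x = max(0, x - 1)
    if key_states & RIGHT_PRESSED and x + 5 < width:
        x = min(width, x + 1)
    if key_states & UP_PRESSED:
        y = max(0, y - 1)
    if key_states & DOWN_PRESSED and y + 10 < height:
        y = min(height, y + 1)
    return x, y
```

Because every key has its own bit, two keys held together (for example right and down) both take effect in the same step, which gives diagonal movement.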




This class is the main MIDlet that starts the application.

import javax.microedition.lcdui.Display;
import javax.microedition.midlet.*;

/**
 * @author kalpa
 */
public class Navigation extends MIDlet {

    private Display display;

    public void startApp() {
        display = Display.getDisplay(this);
        NavigationCanvas gameCanvas = new NavigationCanvas();
        gameCanvas.start();              // start the game loop thread
        display.setCurrent(gameCanvas);  // show the canvas on the screen
    }

    public void pauseApp() {
    }

    public void destroyApp(boolean unconditional) {
    }
}






This will give you a controllable cross.

You can also see the code in the video below 🙂

Categories: j2me