The Age of Big Data Analytics

A 2012 report of the World Economic Forum in Davos described data as a new class of economic asset, one that touches all aspects of society. As of 2013, almost 90 per cent of the data in the world had been created in the previous two years.

The Internet and social networks have increased the data available on the Net on such a scale that observers describe the present as the Age of Big Data. It is estimated that there will be 44 times as much data in computer devices over the next decade, reaching 35 zettabytes in 2020. A zettabyte is one trillion gigabytes, or 10^21 bytes (1 followed by 21 zeros).

Pointing to this trend, IBM’s former Chairman Samuel Palmisano said advanced computation and analytics would enable us to make sense of the enormous data in real time. Online data indexed by Google alone is estimated to have increased from 5 exabytes (1 exabyte equals 1 billion gigabytes) in 2002 to 280 exabytes in 2009, which works out to a 56-fold increase in seven years. In contrast, the growth in computing power in terms of Moore’s Law in the same period was only a 16-fold increase.
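The two growth figures above can be compared by converting each into an implied doubling time. The short Python sketch below does that with the numbers quoted in the text; the function names are illustrative and not taken from any source.

```python
import math


def fold_increase(start, end):
    """Return how many times larger `end` is than `start`."""
    return end / start


def doubling_time_years(fold, years):
    """Years per doubling implied by a `fold` increase over `years`."""
    return years / math.log2(fold)


years = 2009 - 2002                      # the seven-year period in the text
data_fold = fold_increase(5, 280)        # exabytes indexed, 2002 -> 2009
moore_fold = 16                          # computing growth quoted in the text

print(f"Data grew {data_fold:.0f}-fold in {years} years")
print(f"Data doubling time: {doubling_time_years(data_fold, years):.2f} years")
print(f"Computing doubling time: {doubling_time_years(moore_fold, years):.2f} years")
```

On these figures the data doubles in a little over a year, while a 16-fold increase over seven years corresponds to a doubling every 1.75 years.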

Moreover, 20 per cent of Internet search queries typically yield new data. Samuel Arbesman, a Harvard mathematician, has explored the length of time it takes for half of the facts in a field to become obsolete. The field of study, called scientometrics, provides a quantitative analysis of science and examines why everything we know has an expiration date. Arbesman has cited the analogy of predicting the decay of half the atoms in a chunk of uranium over a period. In a somewhat similar manner, data stored online would become obsolete, given the unprecedented inflow of new data.

The huge increase in data is attributed to three main reasons. One, the rise of mobile phones and tablets, including location-based services on mobile devices; two, an enormous increase in video uploading on the Internet; and three, the rise of the so-called Internet of Things, in which devices in daily use (e.g. refrigerators) generate data on their own.

Let us first look at the rise of mobile devices. Their increase has been phenomenal in recent times. The number of mobile devices is expected to exceed the human population by 2013. Cisco, a networking company, predicts that by 2016 there will be 10 billion mobile devices around the world and that the networks will be carrying 130 exabytes of data each year, the equivalent of 33 billion DVDs.

The forecast is justified by recent trends: mobile data traffic in 2011 was eight times the size of the global Internet traffic in 2000. Moreover, smartphones have added to the flood of data. By 2015, the number of smartphones in the world will be up to one billion. An average smartphone uses 150 megabytes of data per month, and this is expected to rise to 2.6 gigabytes by 2016. By then, 60 per cent of mobile users will be using more than one gigabyte of data per month. In a related trend, the number of mobile-connected tablets has tripled to 34 million since 2011. Finally, 4G phones are expected to carry more data than earlier generations of mobile phones. Google’s Android phones are increasing by 850,000 per day, or about 10 devices per second, and will account for 15 per cent of Google’s search volume from mobile devices.

The second major reason for the data increase is the huge input of video from camera phones and other user-friendly devices. Cisco predicts that 70 per cent of mobile data traffic will be video by 2016. YouTube is the second largest search engine today. One report points out that the video uploaded to YouTube in 60 days is more than what three major US networks created in 60 years! With Google pushing for an open standard for television on the Net, the video surge will continue unabated.

Third, data will start flowing from everyday objects like refrigerators and microwave ovens in the Internet of Things.

Data: A Gold Mine of the Internet Age

What is Data Mining?

Sorting data into predetermined categories is easy with, say, 10 data sets, but becomes a challenge if the number involved is, say, five billion. Only powerful computers can handle this sort of processing.

Data on the scale of billions has become commonplace. Handling it needs automation, and that is what data mining is all about. Data mining is the process by which new information is gleaned by examining large databases.

The processing is carried out by machine-learning programs that follow given instructions. The techniques of data mining can be broadly characterized according to the targets given. Let us see some examples.

First, the target could be detecting anomalies in a huge pile of identical returns pertaining to property details or tax levied or claimed. Even one file in a thousand that differs from the rest will be significant. Second, learning by association could be a target. It is best understood through sales strategies. For instance, if you bought a music player, advertisements offering CDs and the latest hits will be shown to you. Or if you have mostly been buying tickets to crime thriller movies, newly released crime movies would pop up on your screen. A well-known online bookseller invariably adds that those who bought the book you have ordered have also bought the other titles listed on your screen. It is an inducement for you to consider and buy.
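The "those who bought this also bought" idea can be illustrated with simple co-purchase counting. The Python sketch below is a minimal illustration only: the order data and function names are made up for the example, and real recommendation systems are considerably more elaborate.

```python
from collections import Counter
from itertools import combinations

# Hypothetical order history: each set is one customer's basket.
orders = [
    {"music player", "CD", "headphones"},
    {"music player", "CD"},
    {"crime novel", "CD"},
    {"music player", "headphones"},
]


def co_purchase_counts(baskets):
    """Count how often each pair of items appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(basket), 2):
            counts[(a, b)] += 1
    return counts


def also_bought(item, baskets, top_n=3):
    """Return the items most often bought together with `item`."""
    related = Counter()
    for (a, b), n in co_purchase_counts(baskets).items():
        if a == item:
            related[b] += n
        elif b == item:
            related[a] += n
    return related.most_common(top_n)


print(also_bought("music player", orders))
```

Counting pairs is the simplest form of association mining; it ignores how popular each item is overall, which a fuller treatment would correct for.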

Third, the data on the goods ordered would be used to group the buyers and, if possible, their locations for further marketing strategies. For instance, the data on dental equipment sold online would be used to build up a profile of demand for such equipment. Such projections must be qualified by many other conditions. Those who buy nets need not be fishermen: they could be anglers who have fishing as a hobby.

Fourth, computers can be programmed to identify spam by learning from mails marked as objectionable or unwanted. Lastly, building predictive models based on the data gathered has become a professional exercise. Such projections of consumer demand, weather patterns, and production and sales trends have been found quite useful.

Though primarily used for advertising on the Internet, data mining is fast becoming a discipline in its own right, predicting probable trends in many areas of today’s uncertain world.

How Data Mining affects us all?

Data mining has an impact on individuals. It allows companies and governments to use the information one provides to reveal more than one thinks. Even as we gather more data than we can handle, powerful computers, especially those working with social networks, will gobble up the huge mountains of data and try to make some sense of it, often in response to corporate demands.

The International Data Corporation foresees a high technology industry in the convergence of mobile devices, social networks and cloud-based computing and data storage. Spending on new technologies is growing at six times the rate of spending on traditional computer servers and PCs. It will pose new demands, especially for storage of data. Cloud computing has come just in time. Companies that provide cloud servers to businesses are expected to get more than half of the spending. On privacy, the Corporation says in a report that while there is increased awareness of privacy issues, there is still no sense of immediate urgency. Users trust the system and the convenience it provides as long as no harm is inflicted on a personal level.

A survey by Ericsson Consumer Lab finds that users feel safe sharing music playlists or their beliefs on religion etc., but are least inclined to share data about their medical records or finances.

Big Data Analytics in India

Computer facilities to handle data are coming up in India in a big way. The Indian grid (a network of computers that share resources) called GARUDA (Global Access to Resources Using Distributed Architecture) has a computing capability of close to 70 teraflops (a teraflop is a trillion floating point operations per second).

It may reach exascale (a billion billion flops) by 2016. GARUDA will facilitate data exchange and analysis over a wide range of fields: health care, bioinformatics, climate modeling etc. India also has a National Knowledge Network, which connects over 700 institutional networks. In addition, there is ERNET, a national network of academic institutions in the country.

Kalaari Capital launches Kstart

Kalaari Capital, a venture capital firm which has more than US$ 650 million (Rs 4000 Cr) in assets under management, last week announced the launch of its incubator, Kstart. Kstart seeks to nurture start-ups and help accelerate their growth. The hub is based at the International Technology Park in Whitefield, Bangalore.

Kalaari, which derives its name from the ancient martial art of Kalaripayattu, has notched up several high-profile investments in Indian startups in the past few years. Its high-profile investees include Myntra.com, Via.com, Snapdeal.com, Yourstory.com and Zivame.com. Kalaari Capital is headed by Vani Kola, who is counted among India’s most successful entrepreneurs and serves as its managing director.

Startup India – Join the Party!

The Startup India mission announced by Narendra Modi last month is expected to give a fillip to the entire sector in the coming years, with a host of incentives like annual tax rebates and a US$ 1.5 billion government fund to invest in new startups. Indian VCs are looking to capitalize on this opportunity and are eager to invest in startups with unique ideas that can address unmet needs in global markets. The business environment in India is very tough, with stringent regulatory norms and various hurdles. Startups today require a lot of hand-holding before they can mature into sustainable businesses. This is where Kstart steps in.

The Kstart Program Launch

The program was launched on 5th February 2016 at the International Technology Park (ITPL) in Whitefield with a one-on-one chat between Mr Ratan Tata and Ms Vani Kola, the MD of Kalaari Capital. The chat, which started at 11:00 am and lasted till noon, had Mr Ratan Tata speaking about his investments in startups and how he sees the startup sector shaping up in India.

The session was followed by a short speech by the art curators who have furnished the Kstart office with an eclectic mix of modern art and installations.

The Kstart incubator is decked up with murals and electronic art installations that lend it a hip, haute couture finish. It should in time emerge as an art connoisseur’s delight for its varied use of dynamic art elements. For those who missed the event, some of the high-end art decorating the incubator is described below. More than 12 artists from around the world collaborated for months to produce these artworks.

A mural on one side of the conference hall.

 

A corner in the office decked up in Black and White interiors and furniture.

 

An interactive LED Display.

The opening session was followed by one in which Bangalore-based founders Bhavish Aggarwal (Ola Cabs), Mukesh Bansal (Myntra), Naveen Tewari (Inmobi) and Kunal Shah (Freecharge) had a freewheeling chat with David Rowan, Editor of Wired Magazine UK. Yourstory has a detailed write-up about what they discussed.

The session was followed by a round of sumptuous snacks that included cakes, rolls and tuna sandwiches (a rarity in India!).

The afternoon session had presentations by Google India and IBM Watson and concluded with a talk by David Rowan, in which he spoke about 10 trends that are shaping the digital economy around the globe. That wraps up a brief overview of the launch event of Kstart by Kalaari Capital. Stay tuned for more updates about startups in India!

The Challenge of Big Data

Data storage is increasing by leaps and bounds thanks to the vast spread of various sensors, transactions and clicks online. The emerging challenge is to analyse the data quickly enough to make sense of it in response to changing demands. The challenge is reflected in the definition of big data: data that is too big, too fast and too hard for existing tools to process. We have today data systems which handle data on a petaflop scale (a thousand million million flops), yet fast processing is still demanded by situations such as fraud detection or sale of goods. What is wanted is machine-learning algorithms that are easier for common users.

The US Government has announced a ‘Big Data’ initiative to advance the state-of-the-art core technologies needed to collect, store, preserve, manage, analyse and share large quantities of data. For example, the data from the 1000 Genomes Project will be put into the cloud. This data set, the world’s largest on human genetic variation, is 200 terabytes in size. Another type of data stored will relate to our planet and will be of great interest to geo-scientists. All of this data will be hosted for free on the Amazon Web Services cloud.

Global Pulse, a United Nations initiative, wants to leverage Big Data for global development. It plans to get digital warning signals to guide assistance programmes in advance. The warning is timely as algorithms increasingly determine an expanding space in our lives.

There can be a downside as well to the data deluge. Computer viruses will have greater scope to attack. Identity impersonation may increase. And intrusions into privacy may go up. These are inevitable consequences of a historic change in the way computers will handle data in the near future.

In 2011, a total of 1.8 trillion gigabytes of data was created. Significantly, three-fourths of it was produced by ordinary consumers. The trend will continue as people expect almost every service from the Internet. The meteoric rise of data on the Internet has a profound impact on the world’s energy resources and pollution levels.

Big Data and Energy Demand

One notable feature of a search engine is the enormous power it consumes. A video on YouTube, for example, showed one of Google’s data centres, where 45,000 servers were placed. It was disclosed that Google has placed an uninterruptible power supply at each server instead of a centralized supply source. It has been stated that a typical search needs 0.3 watt-hours of electricity, which is equivalent to lighting a 100-watt bulb for about ten seconds.

For handling a billion searches a day, it needs 12.5 million watts. Hence it is imperative to save on power. Google recently disclosed that it needs 260 million watts (equivalent to one fourth of the output of a typical nuclear power plant) for its data centres around the world. This is considered enough to power 200,000 households. As Internet traffic is expected to increase four-fold in the next five years, Google has set up a data centre on the Baltic coast of Finland. Globally, about 1.5 per cent of the total electricity generated is used by data centres.
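The 12.5 million watt figure follows from the per-search energy quoted earlier: a daily energy total divided by 24 hours gives the average continuous power. The Python sketch below reproduces that arithmetic from the numbers in the text; the constant and function names are illustrative.

```python
WH_PER_SEARCH = 0.3                 # watt-hours per search, as quoted in the text
SEARCHES_PER_DAY = 1_000_000_000    # a billion searches a day
HOURS_PER_DAY = 24


def average_power_watts(wh_per_unit, units_per_day):
    """Average continuous power implied by a daily energy total."""
    daily_energy_wh = wh_per_unit * units_per_day
    return daily_energy_wh / HOURS_PER_DAY


def bulb_seconds(wh, bulb_watts=100):
    """Seconds a bulb of `bulb_watts` can be lit with `wh` watt-hours."""
    return wh * 3600 / bulb_watts


search_power = average_power_watts(WH_PER_SEARCH, SEARCHES_PER_DAY)
per_household = 260e6 / 200_000     # watts per household implied by the text

print(f"Search load: {search_power / 1e6:.1f} million watts")
print(f"One search lights a 100-watt bulb for {bulb_seconds(WH_PER_SEARCH):.1f} seconds")
print(f"Implied household draw: {per_household:.0f} watts")
```

The same conversion shows that 260 million watts spread over 200,000 households works out to 1,300 watts per household on average.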

Social media and search engines make a huge demand on the world’s energy resources, if only to keep themselves free from potential breakdown. Most of the world’s three million data centres, where mega servers handle data from the Internet, consume vast amounts of energy in a wasteful manner. Worldwide, the digital data centres use about 30 billion watts of electricity, equal to the output of about 30 nuclear power plants. Though a data centre runs at maximum capacity, it typically uses only 6-12 per cent of that power on average for the computational tasks of its servers. The over-provisioning is made simply to keep the servers running for fear of a crash even for a few seconds. Many servers are labelled idle or comatose by engineers but no attempt is made to stop them from idling. Contrary to popular notion, cloud computing does not save energy. The cloud just changes the location where applications are carried out.

Moreover, the back-up generators and batteries deployed as power sources pollute the atmosphere. Several centres have been found violating air quality regulations.

There are several challenges posed by Big Data. First, identifying which data are relevant is a problem. Second, the perfect information on which to base corporate decisions is rarely available; decisions are often driven by leadership and by windows of opportunity as perceived by management. Third, formulating the right questions will be more important than going by the data collected in general.

It is tempting to recall the prescient words of T.S. Eliot, who asked, “Where is the wisdom we have lost in knowledge? Where is the knowledge we have lost in information?” To this one may add the question posed by a critic, viz. “Where is the information we have lost in data?” Certainly, we have created more than what we can comprehend, much less utilize. Perhaps it is a tribute to human ingenuity. As Danny Hillis, an inventor of supercomputers, says, the greatest achievement of human technology is tools that allow us to create more than we understand.

WordPress Theme Features to Consider while building a Theme

What Colors will your theme be available in?

  • black
  • blue
  • brown
  • gray
  • green
  • orange
  • pink
  • purple
  • red
  • silver
  • tan
  • white
  • yellow
  • dark
  • light

How many columns will a theme have?

  • one-column
  • two-columns
  • three-columns
  • four-columns
  • left-sidebar
  • right-sidebar

What kind of Layout will your theme be built on?

  • fixed-layout
  • fluid-layout
  • responsive-layout

What are some of the Features that you can incorporate into your theme?

  • accessibility-ready
  • blavatar
  • buddypress
  • custom-background
  • custom-colors
  • custom-header
  • custom-menu
  • editor-style
  • featured-image-header
  • featured-images
  • flexible-header
  • front-page-post-form
  • full-width-template
  • microformats
  • post-formats
  • rtl-language-support
  • sticky-post
  • theme-options
  • threaded-comments
  • translation-ready

Will your theme be for a specific genre, topic or subject?

  • holiday
  • photoblogging
  • fitness
  • seasonal
  • news
  • fashion

What framework will your WordPress theme be built on?

  • Bootstrap
  • Thematic
  • Genesis
  • Canvas

A list of good frameworks

http://athemes.com/collections/best-wordpress-theme-frameworks/

Online Learning Resources for Freelancers and Web Developers in India

Since late 2010 there has been a mushrooming of online resources providing free and paid access to quality material from which web developers and designers can learn to code and build websites from scratch.

Indians are believed to constitute the second largest group of online learners on many of these websites after the US. Given below is a comprehensive list of online websites and tutorials to jump-start your web designing career.

These resources were compiled in July 2014 and will be updated as and when we come across new ones. Many of these tutorial websites charge a fee for members but you can always look up a course on torrent websites to check its quality before you fork out the money.

Codecademy – http://www.codecademy.com/
currently offers the following tracks for web developers: HTML/CSS, JavaScript, jQuery, Python, Ruby, PHP and APIs.

PHP Academy – https://phpacademy.org/
A dedicated portal for learning PHP through videos; it offers several advanced PHP tutorials for users.

SQL Zoo – http://sqlzoo.net/
is a step-by-step tutorial with live interpreters, allowing access to tables using any of the Oracle, SQL Server, MySQL, Access or PostgreSQL engines. You can learn any of the major databases here for free.

Code School – https://www.codeschool.com/
has many different paths like Ruby on Rails, JavaScript, jQuery, HTML/CSS, Sass, iOS development etc.

Code Combat – http://codecombat.com/
teaches you how to code while playing a game. It offers more than 6 different games to learn JavaScript, Python, CoffeeScript, Clojure, Lua and Io.

Learnable – https://learnable.com/
This is a paid service that offers a free 14-day trial for new users. It offers tutorials on varied topics like HTML & CSS, JavaScript, PHP, Ruby, Design & UX, Mobile and Workflow.

Team Treehouse – https://teamtreehouse.com
Is a well-designed site that offers a free one-month trial for the following online courses: WordPress Development, jQuery Basics, Web Design, Front-end Web Development, Programming with JavaScript & jQuery, Rails Development, iOS Development, Create apps for the iPhone and iPad, Android Development, Learn Java, PHP Development, Starting a Business etc.

Tutsplus – http://tutsplus.com/
is one of the oldest online resources for web developers and comes from the trusted stable of Envato. It offers detailed, in-depth tutorials organized under the following topics: Design & Illustration, Code, Web Design, Music & Audio, Photography, 3D & Motion Graphics and Business.

The Code Player – http://thecodeplayer.com/
plays code like a video, helping people to learn front-end technologies like HTML5, CSS3, JavaScript and jQuery easily, quickly and interactively.

Code Avengers – http://www.codeavengers.com/
Learn how to code games, apps and websites with fun and effective interactive games. HTML, CSS and JavaScript tutorials are for beginners.

Learnstreet – https://www.learnstreet.com/
Offers free courses in JavaScript, Python and Ruby and has more than 100 interactive exercises in each course.

Udemy – https://www.udemy.com
Offers online courses on topics as varied as web development, yoga, guitar lessons or anything else. It has more than 3 million students and about 16,000 courses! It offers both free and paid courses.

Khan Academy – https://www.khanacademy.org/computing/cs
has a dedicated section for computer programming where you learn to write programs and build games using JavaScript and the ProcessingJS library. The focus of the website is more on science subjects and school work.

Pluralsight – http://www.pluralsight.com/training
Is one of the largest tech and creative learning libraries online and has some very good courses and content by experienced professionals. It offers both paid and free content and tutorials.

Lynda – http://www.lynda.com/
Is one of the oldest resources online and also the largest, with more than 4 million users. The fact that most of its tutorials are pirated and available for download on torrent sites shows they are among the best in the business.

Alison – http://alison.com/
Alison provides free, certified courses from the world’s top publishers and is supported by Google and Microsoft.

Killer PHP – http://www.killerphp.com/
Provides PHP video tutorials for beginners and developers.

One Month – https://onemonth.com/
Provides the most in-demand tech courses, each spread out over one month. Their Ruby on Rails program is rated one of the best courses online.

Web Design – http://webdesign.com/
provides professional web design training with WordPress. One of the few websites focused just on WordPress.

Paul Lund – http://www.paulund.co.uk/c/tutorials
A leading web developer from the UK who works on WordPress and PHP and has created several tutorials for freelancers.

CodeHS – http://codehs.com/
Learn to code and program with tutors.

Udacity – https://www.udacity.com/
offers courses in Web Development, Software Engineering, Android and even an online Masters in CS from Georgia Tech.

Coursera – https://www.coursera.org/
Coursera offers one of the widest varieties of courses from some of the best universities in the world.

Edx – https://www.edx.org/
Offers free online courses on an open source platform from only the best universities in the world.

Future Learn – https://www.futurelearn.com/
Enjoy free online courses from leading UK and international universities.

Canvas Network – https://www.canvas.net/
Offers both free and paid courses on a wide variety of topics.

iVersity – https://iversity.org/
is a platform for Massive Open Online Courses (MOOCs).

Online Course Aggregators

If the above links are not enough, you can explore more at course aggregators like:

Skilled Up – http://www.skilledup.com/

Slide Rule – http://www.mysliderule.com/

Course Talk – http://www.coursetalk.com/

Class Central – https://www.class-central.com/

Course Buffet – http://www.coursebuffet.com/

Co-Working Spaces in Bangalore, India

Over the last few weeks, I realized that I have outgrown my current work environment and need to find a new place to work from. So I have spent the last two days looking for co-working spaces in Bangalore. Given below are a few of the good ones I came across.

Co-Working Space on MG Road, Church Street

Cobalt BLR
is a high-end co-working space located off Church Street and is about a year old. It costs about Rs 6,500 per month for a single seat and is situated above the hip and happening Church Street Social. The distance from my residence (10 km) and the price made this a no-go. Know more about it at http://www.cobaltblr.com.

Vatika Business Center

This is a very high-end business center with a lot of value-added services, where a single seat costs Rs 18,000 per month. I saw that many established e-commerce brands like Jabong.com have their offices here. http://www.vatikabusinesscentre.com

Free Co-Working Spaces in Bangalore

TiE Bangalore offers its members free co-working space for 3 months. It is located on Langford Road, but it offers the bare minimum and I noticed that only a handful of people use it.

Alpha Labs Bangalore

This is where I work from now!

I finally settled on Alpha Labs Bangalore, which fits my budget at Rs 5,000 per month. It is located off J.P. Nagar and is about 7 km from home. Today is my first day here and things are going smoothly.

I can see a lot of web developers, web designers and digital marketers like me going about their work.

You can join us at http://bangalorealphalab.in/

Goal 2014

The goal for 2014 is simple:

1. Build one free WordPress theme and release it on the WordPress Themes Marketplace.

2. Build one premium WordPress theme and release it for a nominal price on the Themeforest WordPress Marketplace.

The first theme that we develop will be free or premium depending on how good a job we do at designing and coding it. If our first theme looks good, works well and seems like something people would pay for, we will release it as a paid theme. Otherwise we will give it to the WordPress community for free, who will test it out while we iron out the bugs over time.

Time Limit

Best case scenario: We do it in 101 days, so the launch date will be Monday 20th October 2014.

Worst case scenario: We have to do it before the year ends, which means we have a maximum of 173 days until 31st December 2014.

So the magic number will most likely be between 101 and 173 days.
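The two deadlines can be checked with a few lines of Python. The start date is not stated in the post, so the sketch below infers it by counting 173 days back from 31st December 2014 (an assumption), then adds 101 days to get the best-case launch date. It also prints the weekday so the dates can be checked against a calendar.

```python
from datetime import date, timedelta

YEAR_END = date(2014, 12, 31)
WORST_CASE_DAYS = 173
BEST_CASE_DAYS = 101

# Assumption: the countdown starts 173 days before the year-end deadline.
start = YEAR_END - timedelta(days=WORST_CASE_DAYS)
best_case_launch = start + timedelta(days=BEST_CASE_DAYS)

print("Start date:      ", start.strftime("%A %d %B %Y"))
print("Best-case launch:", best_case_launch.strftime("%A %d %B %Y"))
print("Days between the two deadlines:", (YEAR_END - best_case_launch).days)
```

Working from the deadline backwards keeps both scenarios tied to the same start date, so changing either day count updates the other dates consistently.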