Tuesday, January 7, 2014

IBM's Big Insights training notes


IBM’s Big Insights: A primer


In November 2013 I had the opportunity to attend a training session on IBM’s Big Insights. Below are my notes on the product.


What is Big Insights in a nutshell?


Big Insights is IBM’s Big Data platform. It comprises an all-in-one Big Data infrastructure, with IBM’s flavor of Hadoop and its ecosystem, proprietary tools to query the data such as JAQL and AQL, and out-of-the-box connectors and interfaces called accelerators. We’ll review these components in detail in the sections below.

Big Insights Hadoop infrastructure

Big Insights is built on a Hadoop infrastructure (independent from vendors like Cloudera). It uses a released, well-tested version of Hadoop, usually a bit older than trunk, although it also differs from the Apache version in some ways. Big Insights comes integrated with:
-       GPFS (IBM’s General Parallel File System, offered as an alternative to HDFS) for its file system
-       Adaptive Map Reduce, an enhanced version of MR that attempts to optimize task execution through automatic job tuning of speculative execution and task JVM reuse. Map Reduce tasks become aware of the global state of the job they are working in, which helps balance the workload across Map tasks.
-       Zookeeper, HBase, Hive, Pig
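
To make the workload-balancing point about Adaptive Map Reduce more concrete, here is a toy Python sketch. It is not IBM's algorithm, which these notes do not describe. It only compares a fixed, up-front assignment of input splits with workers that pull the next split as soon as they are free. The function names and split costs are made up for illustration.

```python
import heapq

def static_makespan(split_costs, workers):
    """Splits are pre-assigned round-robin; no task knows about the others."""
    loads = [0] * workers
    for i, cost in enumerate(split_costs):
        loads[i % workers] += cost
    return max(loads)

def dynamic_makespan(split_costs, workers):
    """Each worker pulls the next split as soon as it becomes idle."""
    free_at = [0] * workers          # min-heap of times at which workers go idle
    heapq.heapify(free_at)
    for cost in split_costs:
        start = heapq.heappop(free_at)        # earliest idle worker takes the split
        heapq.heappush(free_at, start + cost)
    return max(free_at)

# Hypothetical per-split processing costs: two splits are much heavier.
costs = [9, 1, 1, 1, 9, 1, 1, 1]
print(static_makespan(costs, 4), dynamic_makespan(costs, 4))  # 18 10
```

With a fixed assignment both heavy splits land on the same worker, so the job takes 18 time units. When workers pull work based on what is still outstanding, it finishes in 10.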

Note that Big Insights is no longer bundled with Cloudera’s CDH; IBM has its own version of Hadoop.

New query language: JAQL

Big Insights offers a language called JAQL, a functional language that can interface with all of the Big Insights tools. It provides APIs (or modules) for reaching out to external IBM and third-party tools, such as relational databases, indexing services, text analytics, machine learning, etc. JAQL stands for JSON Query Language, because it is represented via JSON. Similar to Pig, Jaql automatically takes care of the complexities of the MapReduce world to perform the work optimally. However, it also handles deeply nested semi-structured data.
Jaql can be executed either from its own shell, or from within Eclipse.
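
For readers who have not seen this style of query, here is a small Python analogue of a filter-then-transform pipeline over nested JSON records. This is not Jaql syntax, and the record layout, field names, and `pipeline` function are invented for illustration. It only shows the kind of nested, semi-structured data such a pipeline has to reach into.

```python
import json

# Hypothetical semi-structured input: each user carries a nested array of orders.
records = json.loads("""
[
  {"user": "ann", "orders": [{"sku": "a1", "qty": 2}, {"sku": "b7", "qty": 1}]},
  {"user": "bob", "orders": []},
  {"user": "cy",  "orders": [{"sku": "a1", "qty": 5}]}
]
""")

def pipeline(rows):
    # filter step: keep only users with at least one order
    with_orders = (r for r in rows if r["orders"])
    # transform step: reshape each record, aggregating over the nested array
    return [
        {"user": r["user"], "total_qty": sum(o["qty"] for o in r["orders"])}
        for r in with_orders
    ]

print(pipeline(records))
# [{'user': 'ann', 'total_qty': 3}, {'user': 'cy', 'total_qty': 5}]
```

In Jaql the same steps would be written as a chain of operators, with the engine deciding how to turn them into MapReduce jobs. Here everything simply runs in memory.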

Big Insights Applications

Big Insights provides an environment for developing and executing applications. A business user can launch existing applications from the Web console, supply any input parameters and view results. These applications may be developed using Big Insights’ development tooling, which enables programmers to publish completed applications through the Web console.
The Big Insights Eclipse tools include wizards, code generators, context-sensitive help, and a test environment to simplify your development efforts.
Workflow applications are run by Oozie as a workflow job.

Big Sheets

Big Insights also comes with a spreadsheet-like interface for interacting with Big Data the way business users would use Excel. To do so, it presents a familiar interface (e.g. Pivot, Union, Intersection functions) that allows users to gather, filter, combine, explore, and visualize data from various sources. Big Sheets has been designed for non-technical professionals to rapidly gather insight and analysis from huge amounts of data (Big Sheets executes work on a simulated environment of sample data first), and to act on those insights in a timely manner. There is no need to understand database schemas or a query language. Big Sheets also conveniently has a built-in visualization module to chart and publish the results.
Another nice thing is that Big Sheets is integrated natively with the other Big Insights components, so it’s easy to navigate between the different tools that Big Insights provides; e.g. create an ETL job in Jaql and export the results to Big Sheets.


Big Data Accelerators

Big Insights bundles some pre-built components to accelerate development for specific use cases. The accelerators generally provide business logic, data processing and visualization. An example is the Social Data Analytics accelerator, which provides a set of predefined elements, such as workbooks and dashboards, to analyse social data.

Other Big Data tools

The IBM Big Data platform includes Big Sheets, but also other tools like InfoSphere Streams for low-latency data, and an MPP (Massively Parallel Processing) database. The wider IBM ecosystem also seems to support Big Data: R is supported in Big Insights, Cognos supports Hive, and Netezza integrates with Streams. These systems offer complementary analytical approaches.

IBM offers a free downloadable virtual machine to play with Big Insights.

Overall a good experience, although one can easily get lost in the sea of products IBM offers. On the other hand, tools like Big Sheets and the Accelerators seem very valuable.
