Big data presentation: the Cloudera Apache Hadoop solution


Posted by Hafed | Posted in Big data | Posted on 23-05-2016

Click here for the French version

Recently, Mr Jean-Marc Spaggiari, Principal Solutions Architect at Cloudera, Inc., presented the Cloudera Apache Hadoop solution to our students at Collège de Bois de Boulogne. Here is a summary of the session.

Title: The Cloudera solution: a trip across the big data world

Jean-Marc's presentation focused on the tools and ecosystem used within the Cloudera solution.

Specifically, he presented the following points:

  • Cloudera Manager: administration, configuration, monitoring, security, service deployment
  • Tools and software: MapReduce, Spark, HBase, Impala, Hive, Pig
  • Cloudera as a solution: how it is being used in North America and Québec
  • Open discussion and Q&A
Big data seminar at BDEB


French version

Title: Presentation of the Cloudera solution: a trip across the big data world.

Speaker: Mr Jean-Marc Spaggiari, Principal Solutions Architect at Cloudera, Inc.

Introduction and context: As part of its seminar series on big data, the FCSE team at Collège de Bois de Boulogne invited Mr Spaggiari to share his experience as a principal architect of big data solutions at Cloudera.

Topics covered: The main topics presented relate to big data tools and the big data ecosystem, with a focus on the products supported by Cloudera.

Introductory big data course at Collège de Bois de Boulogne – Montréal


Posted by Hafed | Posted in Big data | Posted on 27-03-2016

From January 4 to mid-March 2016, I taught an introductory course on big data technologies, platforms, and tools at Collège de Bois de Boulogne in Montréal, Canada.

For the most part, it was a hands-on course.

Students needed to know at least one of the following programming languages: C, C++, Java, or Python. They also needed some familiarity with basic statistics and SQL.

Because of the introductory nature of the course, I covered the fundamental platforms, such as Hadoop, MapReduce v1 and v2, YARN, and HDFS, along with other tools such as Pig and Hive. Afterwards, the course introduced Cloudera Manager as an administration tool. The students were required to install small clusters using Cloudera Manager, Ambari, or a vanilla install.

Most students were able to carry out this task. Some sample installation reports are listed below:

TP – Mise en Place d’un cluster_hortonworks_CBH

Students were also required to choose their own topic for a final project, in an application domain that matched their interests. Most students saw the final project as an opportunity to apply what they had learned in class to their own needs, either for future work requirements or for upcoming courses in the big data specialization at Collège de Bois de Boulogne. Some of the projects are listed below:

Hadoop ProjetSession Charles Brisson, Mario Nadon et Yadong Wang
BD3ProjetSession_Yvon Cadieux_Angelo Fernandes
Présentation Éric TREMBLAY et Raoul_kouanda
PrésentationYacine BELHOUL, Abdellilah NAFIA et Cadrick NOUTCHA
Projet de Session Khedidja Seridi et salim Rahali
TP2_session Reda Louahala et Albert Zhu

Overall, it was a challenging yet very satisfying course for most students.