July 16, 2024


Forever Driven Computer

Big data infrastructure internship | Adaltas

Job description

Big Data and distributed computing are at the core of Adaltas. We support our partners in the deployment, maintenance, and optimization of some of the largest clusters in France. More recently, we have also started providing support for day-to-day operations.

As strong advocates of and active contributors to open source, we are at the forefront of the data platform initiative TDP (TOSIT Data Platform).

During this internship, you will contribute to the development of TDP, its industrialization, and the integration of new open source components and new functionalities. You will be accompanied by the Alliage expert team in charge of TDP editor support.

You will also work with the Kubernetes ecosystem and the automation of Onyxia datalab deployments, which we want to make available to our customers as well as to students as part of our training modules (DevOps, big data, etc.).

Your skills will help expand the services of Alliage's open source support offering. Supported open source components include TDP, Onyxia, ScyllaDB, … For those who would like to do some web work in addition to big data, we already have a very functional intranet (ticket management, time management, advanced search, mentions and related articles, …) but other nice features are expected.

You will practice GitOps release chains and write articles.

You will work in a team with senior advisors as mentors.

Company presentation

Adaltas is a consulting company led by a team of open source experts focusing on data management. We deploy and operate storage and computing infrastructures in collaboration with our customers.

As partners of Cloudera and Databricks, we are also open source contributors. We invite you to browse our website and our many technical publications to learn more about the company.

Skills required and to be acquired

Automating the deployment of the Onyxia datalab requires knowledge of Kubernetes and cloud native technologies. You must be comfortable with the Kubernetes ecosystem, the Hadoop ecosystem, and the distributed computing model. You will learn how the basic components (HDFS, YARN, object storage, Kerberos, OAuth, etc.) work together to meet the needs of big data.

A good knowledge of Linux and the command line is required.

During the internship, you will learn:

  • The Kubernetes/Hadoop ecosystem in order to contribute to the TDP project
  • Securing clusters with Kerberos and SSL/TLS certificates
  • High availability (HA) of services
  • The distribution of resources and workloads
  • Supervision of services and hosted applications
  • Fault-tolerant Hadoop clusters with recovery of lost data on infrastructure failure
  • Infrastructure as Code (IaC) via DevOps tools such as Ansible and [Vagrant](/en/tag/hashicorp-vagrant/)
  • The architecture and operation of a data lakehouse
  • Code collaboration with Git, GitLab and GitHub

Responsibilities


  • Become familiar with the architecture and configuration methods of the TDP distribution
  • Deploy and test secure and highly available TDP clusters
  • Contribute to the TDP knowledge base with troubleshooting guides, FAQs and articles
  • Actively contribute ideas and code to make iterative improvements to the TDP ecosystem
  • Research and analyze the differences between the main Hadoop distributions
  • Update Adaltas Cloud using Nikita
  • Contribute to the development of a tool to collect customer logs and metrics on TDP and ScyllaDB
  • Actively contribute ideas to develop our support solution

Additional information

  • Location: Boulogne Billancourt, France
  • Languages: French or English
  • Start date: March 2023
  • Duration: 6 months

Much of the digital world runs on open source software and the Big Data market is booming. This internship is an opportunity to gain valuable experience in both domains. TDP is now the only truly open source Hadoop distribution, and it has strong momentum. As part of the TDP team, you will have the opportunity to learn one of the core big data processing models and participate in the development and the future roadmap of TDP. We believe that this is an exciting opportunity and that on completion of the internship, you will be ready for a successful career in Big Data.

Equipment available

A laptop with the following characteristics:

  • 32GB RAM
  • 1TB SSD
  • 8c/16t CPU

A cluster made up of:

  • 3x 28c/56t Intel Xeon Scalable Gold 6132
  • 3x 192GB RAM DDR4 ECC 2666MHz
  • 3x 14 SSD 480GB SATA Intel S4500 6Gbps

A Kubernetes cluster and a Hadoop cluster.

Remuneration


  • Salary: 1200 € / month
  • Restaurant tickets
  • Transportation pass
  • Participation in one international conference

In the past, the conferences we attended have included KubeCon, organized by the CNCF, the Open Source Summit from the Linux Foundation, and FOSDEM.

For any request for additional information and to submit your application, please contact David Worms: