Saturday, February 12, 2022

Copying files from host to container and vice versa

TLDR; You can use base64 encoding/decoding to copy files

Here is a simple trick to deal with the menace of getting files from the host to a container or vice versa. It even works with VMs running inside your host. We usually use ssh to connect to VMs or containers. With VMs, ssh already has solutions for this (like scp), but the technique I am about to explain works there too.

So in most popular Linux-based docker images there is usually a base64 utility program (or it can be installed with a package manager like apt or yum). The same utility exists out-of-the-box on several other popular operating systems. And even if it is not there (especially on Windows), other applications provide it, e.g. the Git Bash command line.

The steps are simple. Say you have a file called hello.txt with some text.

1. Run the base64 utility to encode the file.
$ base64 hello.txt
2. Copy the output to your clipboard.
3. In your target container instance, in an exec session, you need to do the following:
$ echo "<base64_text>" | base64 --decode - > hello.txt
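
The steps above can be tried end to end on a single machine before doing it for real. This is only a sketch: the `encoded` variable stands in for the clipboard, and the `target` directory stands in for the container file system (both are my own placeholders). It assumes a base64 binary that accepts `--decode`, as in the steps above.

```shell
# Host side: create a sample file and encode it.
printf 'hello from the host\n' > hello.txt
encoded="$(base64 hello.txt)"   # stands in for the text copied to the clipboard
echo "$encoded"

# Container side (simulated with a scratch directory): paste and decode.
mkdir -p target
printf '%s\n' "$encoded" | base64 --decode > target/hello.txt

# Optional check: both files should be byte-for-byte identical.
cmp hello.txt target/hello.txt && echo "files match"
```

Quoting the pasted text matters once the encoded output spans more than one line, which happens for anything bigger than a few dozen bytes.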

The caveats


1. The container image should be Linux based and have a base64 utility installed. If it is not installed, there must be a script-only solution... will share when I find it.
2. In this example, the base64 arguments are based on the binary I used. Other binaries may take a different set of arguments. Please consult the help docs.
3. You need to be able to write on the container file system.
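
For caveats 1 and 2, here is one possible fallback, not a definitive answer. The `decode_b64` helper is a name I made up. It assumes that either a base64 binary accepting `--decode` or a python3 interpreter is present in the image, which will not hold for every image.

```shell
# decode_b64: hypothetical helper; reads base64 text on stdin, writes raw bytes to stdout.
# Tries the base64 binary first, then the base64 module in Python's standard library.
decode_b64() {
  if command -v base64 >/dev/null 2>&1; then
    base64 --decode
  elif command -v python3 >/dev/null 2>&1; then
    python3 -m base64 -d
  else
    echo "no base64 decoder found" >&2
    return 1
  fi
}

# Check which options the local binary accepts before relying on them.
base64 --help 2>&1 | head -n 5

# Usage: aGVsbG8K is the encoding of "hello" followed by a newline.
echo "aGVsbG8K" | decode_b64 > hello.txt
cat hello.txt
```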

NB: #3 is the usual bummer. But if you are able to write on the container file system, why not just use exec sessions⁉️

So why does it work?


I do not have a solid answer. You can look it up on YouTube to understand how the algorithm works. All I know is that the encoding uses an alphabet of 64 ASCII characters (A-Z, a-z, 0-9, + and /, plus = for padding). All of them can be typed on a simple English keyboard (US layout).

Incidentally, base64 is also very popular for data transmission over the internet, because its output is plain ASCII text. I learned this the hard way when I explored the SMTP protocol and servers.

So any sequence of bytes can be encoded, transferred, and then reconstructed using base64. That is the underlying principle of why this works.
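
A small experiment makes the principle easier to see. This is a sketch, assuming a base64 binary that accepts `--decode`; the file names `blob.bin`, `blob.b64` and `blob.out` are arbitrary.

```shell
# Three input bytes become four characters from the base64 alphabet.
printf 'Man' | base64            # prints TWFu

# Arbitrary bytes, including NUL and 0xFF, survive the round trip.
printf '\000\001\377\200binary\n' > blob.bin
base64 blob.bin > blob.b64
base64 --decode blob.b64 > blob.out
cmp blob.bin blob.out && echo "round trip ok"

# The encoded text contains only A-Z, a-z, 0-9, '+', '/' and '=' padding.
if grep -q '[^A-Za-z0-9+/=]' blob.b64; then
  echo "unexpected character found"
else
  echo "only base64 characters present"
fi
```

Because the encoded form is plain typeable text, it passes safely through a clipboard or terminal that would mangle the raw bytes.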

I conclude this post with an interesting exercise. Suppose there is a file with content inside it. This file also has some set of file permissions. How do you get a file across (to a container or elsewhere) with the same permission set?

Saturday, February 5, 2022

Cloud-init or userdata

TLDR; I explored cloud-init/userdata. It is a time-saving feature, mostly used when creating VMs in the cloud. (It is perhaps available for other distros as well).

To start off, I would have to say that cloud-init is mostly a concept/feature associated with Ubuntu server distros. I have been installing these distros a lot and have seen a lot of console output with this term. But I never really understood, much less appreciated, its significance until I explored cloud services such as AWS, DigitalOcean, etc. On cloud services, this feature is commonly referred to as "userdata". (Not all cloud services use that term, and some may not have this feature at all).

As the name suggests it is meant to initialize your VM instance to the required state for you to go about your business - exploring or running software including:

1. Updating software
2. Installing favorite tools - e.g. ifconfig on Linux
3. Creating user account, setting up hostname, etc...
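
To make the list above concrete, here is a rough sketch of a userdata shell script covering those three items. It assumes an Ubuntu image where userdata scripts run as root on first boot. The user name `devuser`, the hostname `demo-vm` and the file name `userdata.sh` are placeholders of mine. The snippet only writes the script to a local file and syntax-checks it; nothing gets installed on your own machine.

```shell
# Write a hypothetical userdata script to a local file.
cat > userdata.sh <<'EOF'
#!/bin/bash
# Assumption: runs as root on first boot of an Ubuntu VM.
set -eu

# 1. Update software.
apt-get update
DEBIAN_FRONTEND=noninteractive apt-get -y upgrade

# 2. Install favorite tools; ifconfig ships in the net-tools package.
apt-get -y install net-tools

# 3. Create a user account and set the hostname (both names are placeholders).
useradd --create-home --shell /bin/bash devuser
hostnamectl set-hostname demo-vm
EOF

# Syntax-check locally before pasting the file into the provider's userdata field.
bash -n userdata.sh && echo "userdata.sh parses cleanly"
```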

That is all you need to know. Here is a quick account of how I came to experience this feature.

So I was exploring SSDP (Simple Service Discovery Protocol). In short, it is a way for programs to find other programs on the network. There are many use cases for this kind of feature in a program. Most commonly, it forms an essential part of similar programs trying to decide among themselves who should be a leader or who should be a slave.

There is a python package that implements this protocol: ssdpy. You have to install it manually. On Ubuntu Server (v20.04), python and pip don't come ready out of the box; you have to install them after running your apt updates/upgrades.

A developer exploring cloud services will normally run into surprises like the ones mentioned above. Once all that is done, you can explore ssdpy. It has CLI commands to start an SSDP server, which can publish some service (usually on the same host machine), and a discovery program that can discover it. You run the discovery program from a different VM on the same network.

However, SSDP doesn't work in the cloud. To test, I spun up two VMs, came up with the above conclusion and then quickly destroyed the VMs so I don't incur operational costs. But then, I thought about testing it with a different set of options.

So basically all the commands required to set up ssdpy needed to be run on two VM instances. It seemed apt to use the "userdata" feature here. Along with this userdata initialization, I also downloaded some bash scripts from my GitHub gist, intended to send notifications to my mobile phone when the VM was "ready".
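
A userdata script for this setup could look roughly like the sketch below. It is not the exact script I used. It assumes an Ubuntu 20.04 image with apt, and the notification step is reduced to a placeholder marker file (`/var/tmp/vm-ready` is a made-up path), since the gist scripts are not reproduced here. As before, the snippet only writes the script to a local file and syntax-checks it.

```shell
# Write a hypothetical userdata script for the ssdpy experiment.
cat > ssdpy-userdata.sh <<'EOF'
#!/bin/bash
# Assumption: runs as root on first boot of an Ubuntu 20.04 VM.
set -eu

# Bring the package index and installed packages up to date.
apt-get update
DEBIAN_FRONTEND=noninteractive apt-get -y upgrade

# Install python and pip, then the ssdpy package from PyPI.
apt-get -y install python3 python3-pip
pip3 install ssdpy

# Placeholder for the "VM is ready" notification step:
# here it only leaves a marker file behind.
touch /var/tmp/vm-ready
EOF

# Syntax-check locally; the same file can then be supplied to both VM instances.
bash -n ssdpy-userdata.sh && echo "ssdpy-userdata.sh parses cleanly"
```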

The final verdict about SSDP is still the same: SSDP doesn't work in the cloud. I am not going to answer why that is so... This post was a short intro to "userdata", or cloud-init.