Monday, December 8, 2025

The freedom from scams /s

This post is a story about CompanyA, a force of modern civilization. Its actual backstory starts long before this paragraph, but at some point it launched a top-notch modern phone operating system called OpsSys - an OS meant for the smartphones of today. Cut to the modern day: it recently announced changes to its flagship (open-source) operating system. Any developer who wants to build apps for such phones must now register centrally at an online portal that CompanyA provides. For simplicity, I will call this CompanyA's "developer enrollment programme". There may be fees and identity documents to be shared as part of the process. However, that is hardly the real concern here.

It makes side-loading applications onto the phone impossible. More accurately, OpsSys phones won't install apps from developers who have not enrolled in that programme. Side-loading is a fancy term for installing software on the phone from outside the official store. Until now, app developers on the OpsSys platform could distribute their applications via the OpsSysStore or outside of it - a "store" being a fancy name for an online application or software registry or library. The OpsSysStore registry is different. It is managed by CompanyA. It is awesome. It is safe.

But OpsSysStore does glorified side-loading. CompanyA won’t admit it is side-loading. (It would be a marketing nightmare).

Disclaimer: The corporation/company mentioned in this post is inspired by real-world stories. If it happens to represent something or somebody in the real world, that is not intentional; it is coincidental. The same applies to real-world actions or events. The criticisms in this post are not meant to disparage anyone, anything, or any event, real or fictional. They are only meant as a vehicle to highlight a modern issue.

Side-loading saga - part 1

The OpsSys operating system embodied a sense of open source. That meant its source code was openly available for people to see and run on their own devices. Developers published applications on the OpsSysStore which, mostly, embodied the same open-source spirit.

Open source was born out of the ideals of a very popular (real-world) movement called the "free software movement". People often misunderstand what the movement actually means, but it embodies the spirit of "software freedom". And like in the real world, people in this fictional world also misunderstand "software freedom". (Not everyone here is tech savvy or a nerd.)

Do you recall that "developer enrollment programme"? Well, CompanyA had a crude version of it long before the public announcement of that programme - probably a decade before. It was aimed at preventing malicious applications from ending up on the OpsSysStore, and it did succeed at that. However, certain developers thought the criteria CompanyA had set forth for deciding which apps were safe were quite draconian. I think this is why alternate application registries were born - not the only reason, but one of them.

One such registry became popular - AltOpsSysStore. CompanyA, however, hated this.

Side-loading saga - part 2

Why would CompanyA hate this? Can you see the drama?

Recall, I mentioned a "developer enrollment programme". Then, in the previous section, I said there was a crude version of that enrollment programme. Let's focus on the latter and assume CompanyA's story is paused here. It hasn't died or anything; I am merely saying nothing eventful has happened yet.

CompanyA may have blurted out that "side-loading" is bad and evil for its consumers. They are not wrong. One shouldn't download apps from untrusted sources: it could be malware, it could be ransomware. These are real-world digital maladies, and they also exist in this fictional world. It is hard to overstate how frequently unsuspecting people (in both worlds) fall into this trap and lose their money as a result.

By saying "to side-load is evil", CompanyA is trying to protect its consumers. It seems like CompanyA genuinely cares about consumers. What is not said, however, is that the consumer is the money. If consumers flock away from OpsSys phones - whether the fault is CompanyA's or otherwise - CompanyA loses money. Ransomware, malware, and viruses are examples of digital maladies, and it is not CompanyA's fault they happen. The economics of why the consumers are the money is quite complicated, and CompanyA will easily deny all of it.

But CompanyA is still noble in a sense. It is fighting a disease rampant in this fictional world. Although the details are intricately twisted by the veil of capitalism, CompanyA is somewhat heroic from this point of view. If you, the reader (fictional or real), paused for a minute to understand why digital maladies exist, I wouldn't need to write such stories.

DigiCRT - CompanyA’s streaming platform

This is one of CompanyA's success stories, apart from the OpsSys operating system. DigiCRT is a state-of-the-art video streaming service. It allows creators to upload videos, and the consumers of those videos can easily share them and comment on them on the platform. So it's not strange that CompanyA decided to publish an app for this on the OpsSysStore and call it "DigiCRT".

Initially, DigiCRT allowed a kind of free use: consumers could enjoy content without breaks. The service grew so popular that CompanyA had to procure many datacenters spread over several locations. Consumers loved it; there were ever more people enjoying the streaming service. The irony is that those same people mostly never understood how computers work. It may seem disconnected, but the consumers in this fictional world never evolved to understand that a datacenter houses thousands of computers, and that a datacenter drinks electricity to stay operational. A few people did wonder how DigiCRT managed to keep the platform ad-free for so many years. To them, that curiosity alone was probably the 8th wonder (of that fictional universe).

And then, probably a decade later, DigiCRT decided to monetize the platform. Consumers were slightly miffed when the service started showing ads. But ads were actually a good thing!

Another positive development took place around the same time: the rise of content creators. It wasn't that they didn't exist before, but their corpus on the DigiCRT platform became more prominent.

DigiCRT/CompanyA would probably deny that the ads were meant to pay for their datacenters. In fact, their marketing team would probably dodge the issue altogether and eulogize the creator corpus instead. It is something worth protecting, after all: "the creators" are good people, and they need to be supported in some way so they can manufacture more content on DigiCRT.

And then the monetization measures became worse.

Side-loading saga - part 3

I pity CompanyA. It launched the OpsSys operating system as open source, and most of the developers (and the companies they represented) published their apps in that same spirit. That didn't mean, however, that closed-source applications never made it onto the platform. Unbeknownst to many, the DigiCRT app on the OpsSysStore is closed source.

However, that didn't stop alternative clients (apps similar to DigiCRT's) from being developed. CompanyA's draconian policies would prevent developers from publishing apps that competed with its services or violated a service's contract, the "Terms of Service". The people of this fictional world never bothered to read those - statistically speaking, of course. The language of those legal documents is incredibly dense and arcane. It is a 9th wonder why the DigiCRT content creators of this fictional world never made interesting videos on the legalese of internet services - even DigiCRT's own. Perhaps the lawyers of that world are asleep. Or is it because the platform's engagement algorithms don't incentivize those videos? Let the story not tread there, though.

AltOpsSysStore and its cousins became even more popular as a result. Soon there were apps competing with DigiCRT's own mobile app, all resting on a technicality of anonymity. The main feature these rival apps provide is the ability to consume DigiCRT video content sans ads.

Now CompanyA's DigiCRT content creators are in despair. Many of them have built their livelihoods on the DigiCRT platform creating awesome video content, making money through the ad revenue programme that DigiCRT designed. What now? Everyone at CompanyA is still denying that the datacenters cost money. Most consumers still don't understand computers, much less datacenters. Even now, in this world, there is still no chapter in its high school social science textbooks titled "datacenters".

Ironically, in this fast-paced fictional world, CompanyA announces a "developer enrollment programme" that prevents benevolent developers' apps from being side-loaded. Is it intended to stop the rival DigiCRT apps? Will the veil of CompanyA's capitalistic ambitions pitch the story of content creators in despair? Or will CompanyA pitch that it is really fighting digital scams and maladies? The people of that world may never know the truth.

Freedom

The AltOpsSysStore and cousin communities were quick to raise the alarm, but their efforts are really futile. It's not difficult to see why: there are several chapters not yet featured in their world's high school social science textbooks, and one of them relates to the veil of capitalism. Still, that community has elegantly pointed out CompanyA's violation of software freedoms.

That world is at this peculiar juncture partly because such freedoms exist. The people of that world cannot consciously or unconsciously admit that the OpsSys operating system exists because such "software freedoms" exist. Yet the whole affair has quite subtly pointed out that those freedoms alone don't guarantee a less malicious world - one where things like digital scams or digital maladies cease to exist.

The people in this world share a lot of similarities with people in the real world in how they perceive freedom. And "freedom" is a strange word. It kind of taunts people into thinking about limitless possibilities. But it fundamentally doesn't solve the "human"; instead, it conveniently makes the human forget that fact for a while. The bodies of these people still feel hungry at times; they need food to cure it. They feel horny, sad, depressed, or even lonely at times. Freedom of any kind isn't going to solve this. A hungry predator is not going to leave you alone because you attained freedom. You are not going to be shielded from the damaging effects of a natural calamity such as an earthquake or a large-scale forest fire, if you happen to be at its epicenter, just because you understand or enjoy "freedom".

Oh evil…why?

However, certain people in that world look at the whole affair a little differently. For them, evil was always there; relative civilization, or modernity, has only exacerbated the frequency of malicious incidents. Nobody can yet explain why. Perhaps it is a symptom of the digitalization movement, or syndrome. Today in this world, even for a bank to offer banking to its customers, there is an OpsSys app on the phone. There are matrimony apps and dating apps. There are a ton of pornography websites - some of them have apps that run on OpsSys phones.

I may have committed the crime of showing the people of that world some doubtful solutions to the "human" issues. But why is it an "issue"? I hope they do understand that...

It is time to conclude this strange story. I don't know if there is really something of note you would want to take away from this pathetic story, much less relate it to the real world. Do ponder why evil exists in this civilization. Be critical enough to peer through the veil of capitalism. Encourage your generations to do it too. I don't know exactly when, or at what appropriate age, one would need this, but I guess you or others will peacefully figure that out.

I would really like to know how people plan to succeed in stopping digital evil in this world. Are their measures going to be draconian too? I am writing fully aware of this era - a time many decades after World War 2. Though I have not personally lived through that time, I have come to believe that one specific powerful individual hated a specific community of people - a community thought to be generally good at doing business (capitalists) - and the world had to witness a gruesome ethnic cleansing ritual called the Holocaust.

 

Saturday, September 13, 2025

[Rant] A gmail google takeout frustration

 ⚠️ The intention here is not to point fingers at products or an organization. 

TLDR; I tried to "cleanse" my gmail inbox, and gave up. But I discovered some pythonic support along the way. Respect for that. 🙏

If you are someone from my generation, statistically speaking there must be around 50k unread emails in your inbox. I do not want to explain why that is so; it is a fun exercise to work out how such inboxes got into this dire state. For this blog post, it is enough to say that most of us are victims of capitalism and the communication revolution.

I set out on the journey of reducing my Google One storage footprint. Somewhere in one of Google One's web pages, you can see product-wise usage of the storage (between Photos, Gmail, Drive, etc). Gmail wasn't the culprit here: Gmail held a sizable chunk of data, but Photos was taking 7x more than that.

So I figured out what I could cull from my gmail inbox. There are a lot of things I need to keep - bank statements, receipts, personal communications with people in my friends circle, etc. And there are a lot of things I can discard - all of those facebook notification emails, quora email digests, etc.

At this stage, there is no point in wondering why the inbox is flooded with these. At one point I was active on social media and therefore never considered those emails junk. But now they are! Oh, how the times have changed!!

I assumed there would be a way to estimate what I was about to delete, and then actually delete it. In other words, I wanted to delete only what I deemed unnecessary, and not something accidentally.

My plan was to do a kind of data analysis on the data - in simpler words, find out who is spamming me - then gather the results and write a Google Apps Script to delete the emails. But I was in for a shock. Two important points:

1. Many mail items had a received time, but the times are not timezone normalized. In some cases, the formats themselves varied. There is no reason one should expect data that way.

2. You cannot trace a mail item to a thread. Gmail servers organize emails as conversation threads, and each thread has a unique identifier. I know this because I have played around with gmail inboxes via Google Apps Script. There appears to be an identifier in the export, but it has no correlation with what is on gmail's servers.

These two reasons made it absolutely impossible for me to decide what to delete before deleting it.

I used jupyter notebooks and pandas to help me out with this exercise. I was surprised that python has built-in support for working with mbox files. My choice of a programming language such as python is personal: the language and its ecosystem are quite mature for data analysis. I am sure other language ecosystems exist. But having said all this, I am not trying to evangelize the usage of one product or another.
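For anyone who wants to attempt the same analysis, here is a minimal sketch of the kind of notebook code I mean - assuming your Takeout archive contains a file named Inbox.mbox (that file name is a placeholder for whatever Takeout gives you). It uses python's built-in mailbox module, normalizes the Date header to UTC where it can, and asks pandas who the top senders are:

import mailbox
from datetime import timezone
from email.utils import parsedate_to_datetime

import pandas as pd

# "Inbox.mbox" is a placeholder; point it at the mbox file from your Takeout archive
rows = []
for msg in mailbox.mbox("Inbox.mbox"):
    # Date headers come in varying formats/timezones; normalize to UTC when possible
    try:
        received = parsedate_to_datetime(msg["Date"]).astimezone(timezone.utc)
    except (TypeError, ValueError, AttributeError):
        received = None  # some items simply have unparseable dates
    rows.append({"from": msg["From"], "subject": msg["Subject"], "date": received})

df = pd.DataFrame(rows)
# "Who is spamming me?" - count mail items per sender
print(df["from"].value_counts().head(20))

Even with this, the thread problem from point 2 remains: nothing in the mbox reliably maps back to gmail's server-side thread identifiers.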

It has been more than 10 years since google takeout was released, so it is natural for a bloke like me to be shocked that there is no support for a use case like mine. I think it is probably because it takes effort to design an archival format that mirrors your email server, and my use case happens to be niche. People think more about backing up their data, not about finding a convoluted way to "cleanse" it.

However, I am hardly discouraging others from walking this path. In fact, I encourage it. Who knows, a fresh set of eyes might discover something I could not. Just be happy to let people know if it comes to that.

Friday, June 13, 2025

My 2 paisa (cents) on digipin

Digipin is a geo-coded addressing framework. To appreciate the concept and the acceptance of digipin, you need to contrast it with the concept of India Post's PINCODE. I am purposefully omitting all that in this post.

I feel this addressing system helps a lot of "machine readable" systems. That term is very broad - too esoteric for the common man to grasp. Ultimately, I think "machine readability" is what enables the common man to shop online, order food, get turn-by-turn directions while navigating, etc. It is the whole reason you are able to read this blog post today. However, let's not go there either.

Again, the next few lines are perhaps what a common man would not appreciate all that much... the crux of this addressing framework is a system for encoding/decoding GPS coordinates, mainly latitude and longitude. Somebody, or a handful of people, thought of it and fought to make it relevant. Kudos for that.
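To give a flavour of what "encoding a latitude/longitude" means, here is a toy python sketch of the general idea behind grid-based geocodes. To be clear, this is NOT the actual DIGIPIN algorithm - the official grid, symbol set, and bounding box live in the India Post/DHRUVA documentation; everything below is made up purely for illustration:

# Toy grid geocoder - illustrative only, NOT the official DIGIPIN spec.
# Idea: repeatedly split a bounding box into a 4x4 grid; each level
# appends one symbol, so a longer code means a finer location.

SYMBOLS = "0123456789ABCDEF"  # 16 symbols, one per cell of a 4x4 grid

def encode(lat, lon, bounds=(6.0, 38.0, 68.0, 98.0), levels=6):
    """Encode lat/lon into a short code. bounds is a rough India-like
    box (lat_min, lat_max, lon_min, lon_max), chosen for illustration."""
    lat_min, lat_max, lon_min, lon_max = bounds
    code = []
    for _ in range(levels):
        lat_step = (lat_max - lat_min) / 4
        lon_step = (lon_max - lon_min) / 4
        row = min(int((lat - lat_min) / lat_step), 3)
        col = min(int((lon - lon_min) / lon_step), 3)
        code.append(SYMBOLS[row * 4 + col])
        # shrink the box to the chosen cell and repeat
        lat_min, lat_max = lat_min + row * lat_step, lat_min + (row + 1) * lat_step
        lon_min, lon_max = lon_min + col * lon_step, lon_min + (col + 1) * lon_step
    return "".join(code)

print(encode(12.9716, 77.5946))  # Bengaluru's coordinates, for example

Decoding walks the same subdivision in reverse and recovers the centre of the final cell. Treat this purely as a mental model for what the real framework formalizes.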

Every Indian would want to tout this as a novel idea; it is only relatively novel. There is already a manifestation of such an addressing framework in Google Maps (good luck finding what it is, much less using it). There are many others, like What3Words (car enthusiasts might know it). You can find more if you look up geocoding on Wikipedia.

A common man can find more information on digipin (if he or she is ⚠️determined). I found a github repo which gives neat technical documentation. There is also DHRUVA - an acronym - a document that details the idea behind digipin. All of these can be found on the website of India Post after a little internet search.

Final Note

All this is good innovation. But don't forget we are still a savage species. Some individuals of this species can give birth to increase its relative population. The mechanism concerned has been the same for millions of years, and it is going to be the same for millions more. Of course, today there are technologies like IVF, c-sections, etc. (But that is beside the point.)

A subset of that population gets to witness these kinds of innovations over their lifetimes. In one sense, that is a blessing. But don't forget what "these individuals" did over the years to get there. They have lied, rescued, murdered, fought, escaped, eaten food, polluted, cheated, innovated, raped, invented, travelled, fake-newsed, voted, blogged, judged, punished, etc. They've done a myriad of things, both good and bad.

Realize that humanity even today still continues to do these same things.

And oh yes, there is a digipin for places in the Indian/Arabian oceans too.🌊

Wednesday, March 20, 2024

Dammit root-check (a yeoman bug saga)

TLDR; Discovered a bug in yeoman, and thought it was a bug for a long time - until I discovered a module called root-check. The bug wasn't actually a bug!

This is a rant. My previous blog post adds a little context here, but it is not "absolutely necessary" to read that one first before continuing. This is a post about a bug I recently faced with yeoman.

For those in the dark, Yeoman is a program to scaffold projects - mostly nodejs, but you are not limited to nodejs projects; you can scaffold any kind of project, and you can make it do some simple tasks. So when I figured out how to make cloud-init vmdks, I thought the process needed a bit of automation, and went ahead and made a yeoman generator for it.

So this involves writing a custom generator. 

A generator, in yeoman parlance, is an encapsulation of a folder structure, files (which can be templated), and scaffolding logic. When a generator is run in your working folder - usually for a specific purpose, e.g. to scaffold a react app, an angular app, or a loopback 4 application - it creates the necessary folder structure, assets, etc (including license texts) to get you quickly started with the respective development process.

The yeoman cli (command-line interface) is a nodejs program that helps humans download generators from the internet (i.e. generators that meet their project needs), discover generators already downloaded/installed on their machine, and execute them. There are other parts involved in the picture; together they form a kind of ecosystem for scaffolding projects, in the same spirit as tools like Maven or NuGet.

I will narrate my recent yeoman experience. The goal is not to ridicule the project maintainers. If you ever asked someone what open-source software development looks like, this blog post might give you a perspective. I am not expecting the reader to be familiar with the tool; however, I make no attempt at explaining specifics, since I believe they are self-explanatory.

The first hiccup was the cli and the generator-generator. Today, from npm you download the following:

yo@5.0.0
generator-generator@5.1.0

And when you type yo generator (to scaffold a custom generator project), the cli errors out! But once you google enough, you will find that the simple fix for this problem is to downgrade yo, i.e. install yo@4.3.1. With this I was able to progress with authoring my generator. But please note this as issue #1.

I knew what the generator should do. But when it specifically came to doing a linux filesystem mount, things started to break, and I didn't know why. I ensured that I ran in a rooted terminal and all. I wrote some isolated tests and confirmed that there was actually no fault in the code I wrote. And yet, its failure to work through a yeoman cli invocation escaped me. Make note of this as issue #2.

The next thing I did was raise an issue on github. The issue post contains examples of what I am trying to accomplish, and an isolated example which proved that the file mount worked as expected when running as root. (You will also find a gist in that post.)

There was an itch to "tell the world" first; I went around forums and asked people if they would react to the github issue. It is unethical to do this, but people do it anyway. My aim, however, was to get other people to somehow confirm that they could reproduce the bug, and then perhaps ask them nicely to react on that issue!

Those attempts didn't work anyway. So there was no choice but to read the source code. I wondered if this could be a bug in nodejs itself?! On linux?! Could I pride myself on discovering a nodejs bug?!! All the source code research did was help me make a better isolated test script, one that modeled what happened inside the yeoman cli. And to my surprise, even that test performed the file mount - whereas when yeoman tried to run my generator, the linux file mount failed! I was flabbergasted. Here is the link to that isolated example: https://gist.github.com/deostroll/b69f6868c99f97bccb14bf1b848c7bbf#file-index-js

For a long time, I thought the cause could be issue #1. Was I working with outdated components? I made the decision to find out which updated components I should use, but I couldn't find any in the standard official repos that npm pointed to. This made me wonder about the OSS experience. Now I was really at the mercy of the maintainers, or on my own to fix the problem - because, as of that moment, my issue on github was merely bytes of data stored in github's database, residing in a datacenter somewhere in the world. Would someone ever respond as to why the bug was so?

Lingering around, trying to find out which updated component versions I could work with, I discovered a few other facts. Many OSS projects in javascript, and web development in general, are in some kind of movement to embrace new standards - like async/await, decorators, etc. Some of these standards are not yet formalized into the language itself. Decorators, for example, are not yet part of the ECMAScript standard; they are still experimental, but thanks to typescript, developers can already enjoy them in their code bases. So this is what is happening in our software landscape today - a kind of migration of code patterns.

Most OSS projects have nothing new to bring to the table for developers, but they do this migration anyway, for several reasons. Some of them do it well - their end-developers are not affected, and everything works as before. For others, not so much. I seem to be stuck in this branch of life. Yeoman is migrating. They are even namespacing their projects over at npm, in an effort to reinvent the wheel. This leaves developers like me in the shadows as to how to fix things. But make a mental note of my actual position: I am not someone deeply involved with this project. I have not made any contributions, nor do I go about doing code reviews or responding to other issues on their github issues page. I pick up the so-called software after 10 years, find an issue, and post it on their issues page in full hope that someone will quickly respond. And then I learn about this great migration, and realize my bug may never get the response I am hoping for.

Ultimately, what gave me the clue was the second version of the isolated test. If I plug my generator into my yeoman environment properly and run it in an elevated (or rooted) terminal, my file mount succeeds. But the same thing via the yeoman cli still fails! At that moment, there was still no answer.

And then one of the maintainers responded to my issue post. His response was that I was working with outdated components; they were more than 5 years old! I don't know why the maintainer avoided the actual issue. That is when I properly "read" what I had posted. Compare the 1st isolated test and the 2nd: the second one was more explainable. Perhaps there was a better probability that a maintainer would understand the underlying issue IF I had posted that one. I wondered to myself what I was smoking when I wrote the issue post like that. 🤔

So what ultimately made me figure this all out? I happened to capture the error code from the (linux) mount program, and googled it. It turns out this error only happens when the mount command runs as a non-root user. But I was not running as a non-root user! So does yeoman have a thing about running as root...? IT DOES. The cli program uses a module called root-check: if it is invoked anywhere in code and the terminal is rooted, it downgrades the process to a non-root one. And in my case, there was no indication of this other than the failure of the mount command!

The damn bug was actually a feature! 🤦‍♂️
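For the curious, the trick root-check pulls is easy to model. root-check itself is a nodejs module; what follows is a minimal python sketch of the same idea (not root-check's actual code), assuming a POSIX system. The point is just that a process started as root can irreversibly drop to an unprivileged uid, after which anything that needs root - like mount - fails:

import os
import subprocess

def downgrade_if_root(fallback_uid=65534):  # 65534 is conventionally "nobody"
    """Mimic root-check's behavior: if running as root, drop privileges."""
    if os.geteuid() == 0:
        # Prefer the invoking user's ids if launched via sudo, else fall back
        os.setgid(int(os.environ.get("SUDO_GID", fallback_uid)))
        os.setuid(int(os.environ.get("SUDO_UID", fallback_uid)))
        # after setuid, the process cannot regain root

downgrade_if_root()
# This now fails with a permission error, even though the script
# itself was launched from a rooted terminal - exactly my symptom.
subprocess.run(["mount", "config.img", "/mnt"], check=True)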

A few minutes prior to finding the answer to this problem, I came across an issue post on the repository of the root-check module. It is titled rather comically. The OP expresses his astonishment/angst, and suggests that the module should have an environment variable to toggle the root-check behavior. The maintainer provided an apt reply. But somehow, after all this experience, reading that OP's post, I could understand his sentiment - and I wanted what he wanted.

Thursday, February 22, 2024

So I think I figured cloud-init!

Previously I wrote about cloud-init as part of something a cloud service provider might offer. (Or perhaps I simply called it user-data; I don't accurately recall.) Recently I learned that it is something provisioned by the operating system itself (especially linux-based servers). I do not know the history of the os feature as such, but if you are someone who plays around with Oracle VirtualBox for provisioning VMs, then you need to know how to benefit from cloud-init.

I had to go through several webpages and perform several experiments to get it right. So in this post I will collate the steps to get you easily started.

  1. TLDR: Grab the vmdk file. Create a blank VM (no disk) and attach the disk you have downloaded.

    You need to grab a cloud image. Every distro has one. It's a vmdk file - a virtual disk file, not the traditional iso. If you create a blank VM, attach this disk, and power up the VM, the os will boot and display the login prompt. But since you don't know the credentials, you cannot proceed to work with the vm instance any further.

  2. Compose the cloud-init disk. Multiple steps are required here; see the dedicated section below.

  3. Attach the cloud-init disk to the above VM, and boot.
After the VM boots, you can log in with the user profile you specified in step #2. It will have all the necessary artifacts you specified in that step.

The benefit is that you now have a VM configured with all the necessary software to further your exploration.

Goodbye secure practices👋🫡


Most cloud-init tutorials will talk about creating different users, public-key-based ssh authentication, configuring sudo to not prompt for a password, etc. I am assuming you are a novice to the whole concept of cloud-init, and that you are working in some kind of personal, self-exploratory capacity.

Bottom line: those secure practices are meant for professionals or experts. I am assuming most readers of this post are trying to become one, and that they understand they should not repeat the insecure shortcuts below in their professional work.

Configuring the cloud-init disk


Most of the information here is obtained from this thread: https://superuser.com/questions/827977/use-cloud-init-with-virtualbox/853957#853957

I will reproduce the commands in a slightly different way below. Now is probably a good time to check out some cloud-init docs and tutorial videos; they will give you a precursor to the stuff I write below, e.g. in the user-data or meta-data files. The tutorials you find online are vastly different from what you are going to go through here.

0. What am I doing here?


I am creating an instance with nodejs pre-installed, starting off with an ubuntu cloud image. When you log in, you should theoretically be able to work with node in an out-of-box style. You log in to the os with username/password ubuntu/ubuntu. The hostname of the provisioned machine is osbox03. All this is done by cloud-init: the process downloads nodejs and makes its binaries globally available. However, for cloud-init to work in this manner, we need to create a disk with a certain label and copy over some files which hold the necessary cloud-init configuration. This is outlined in the steps below. At the end you will also find a link to a gist which has all the data and commands you need to type.

1. Create a user-data file:


#cloud-config
users:
  - default

ssh_pwauth: true
chpasswd: { expire: false }
preserve_hostname: false
hostname: osbox03
runcmd:
  - [ ls, -l, / ]
  - [ sh, -xc, "echo $(date) ': hello world!'" ]
  - [ sh, -c, echo "=========hello world=========" ]
  - [ mkdir, "/home/ubuntu/nodejs" ]
  - [ wget, https://nodejs.org/dist/v20.11.1/node-v20.11.1-linux-x64.tar.xz, -O, /home/ubuntu/nodejs/node-v20.11.1-linux-x64.tar.xz ]
  - [ tar, xvf, /home/ubuntu/nodejs/node-v20.11.1-linux-x64.tar.xz, -C, /home/ubuntu/nodejs/ ]
  - [ ln, -s, /home/ubuntu/nodejs/node-v20.11.1-linux-x64/bin/node, /bin/node ]
  - [ ln, -s, /home/ubuntu/nodejs/node-v20.11.1-linux-x64/bin/npx, /bin/npx ]
  - [ ln, -s, /home/ubuntu/nodejs/node-v20.11.1-linux-x64/bin/npm, /bin/npm ]
  - [ rm, /home/ubuntu/nodejs/node-v20.11.1-linux-x64.tar.xz ]

system_info:
  default_user:
    name: ubuntu
    plain_text_passwd: 'ubuntu'
    shell: /bin/bash
    lock_passwd: false
    gecos: ubuntu user

2. Create meta-data file:


instance-id: my-instance-1

3. Create the cloud-init disk:

Follow these steps:
# Create empty virtual hard drive file
dd if=/dev/zero of=config.img bs=1 count=0 seek=2M

# put correct filesystem and disk label on
mkfs.vfat -n cidata config.img

# mount it somewhere so you can put the config data on
sudo mount config.img /mnt

Copy the user-data and meta-data files to /mnt, and then unmount:
sudo cp user-data meta-data /mnt
sudo umount /mnt
config.img is now hydrated with the cloud-init setup. We need to convert the file from the img format to the vmdk format:
  
sudo apt-get install qemu-kvm
qemu-img convert -O vmdk  config.img config.vmdk

Now attach config.vmdk to the VM created in step #1, and power it up.

After you have powered up your VM, you can log in at the console. Quickly inspect the /home/ubuntu/nodejs folder; if its contents don't exist yet, you may have to wait a while for cloud-init to conclude its work. You can run the following command to inspect the cloud-init output:

cat /var/log/cloud-init-output.log

If anything fails, you will learn about it through the above output. And if everything works out, you can type the following and self-confirm everything works:

node  --version && npm --version

An alternative command you can run to assess the status of cloud-init is below:

cloud-init status

That's all folks!

Gist: https://gist.github.com/deostroll/bcb18a5d25f533b4aad3f27566219bf9

Saturday, February 12, 2022

Copying files from host to container and vice versa

TLDR; You can use base64 encoding/decoding to copy files

Here is a simple trick to deal with the menace of getting files from the host to a container, or vice versa. It can even work with VMs running inside your host. We usually use ssh to connect to VMs or containers; when working with the former, ssh itself has solutions (like scp) to do the same thing. But it can still be done using the technique I am about to explain.

Most popular (linux-based) docker images will have a base64 utility program. (Or it can be installed via a package manager such as apt or yum.) The same utility exists out-of-the-box on several other popular operating systems, and even when it is not there (especially on Windows), other applications provide it - e.g. the git bash command line.

The steps are simple. Say you have a file called hello.txt with some text.

1. Run the base64 utility to encode the file.
$ base64 hello.txt
2. Copy the output to your clipboard.
3. In your target container instance, in an exec session, you need to do the following:
$ echo <base64_text> | base64 --decode - > hello.txt

The caveats


1. The container image should be linux-based and have a base64 utility installed. If it is not installed, there must be a script-only solution... I will share it when I find it.
2. In this example, I used the base64 binary as I experienced it. Some binaries may have a different set of arguments; please consult the help docs.
3. You need to be able to write on the container file system.

NB: #3 is the usual bummer. But if you are able to write on the container file system, why not just use exec sessions⁉️

So why does it work?


I do not have a solid answer. You can look it up on youtube to understand how the algorithm works. All I know is that the encoding is based on 64 ascii characters, all of which can be typed on a simple english keyboard (US layout).

Incidentally, base64 is also very popular for data transmission over the internet, because it is mostly ascii-based. I learned this the hard way when I explored the SMTP protocol and servers.

So any digital sequence of bytes can be encoded, transferred, and then reformed using base64. That is the underlying principle of why this works.
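Here is a minimal python sketch of that principle (python purely because it is handy; the shell steps above do the same thing): arbitrary bytes go in, printable ascii comes out, and decoding restores the exact bytes.

import base64

payload = bytes(range(256))          # every possible byte value
encoded = base64.b64encode(payload)  # printable ascii, safe to copy/paste
decoded = base64.b64decode(encoded)

assert decoded == payload            # the round trip is lossless
print(encoded[:40])                  # peek at the ascii form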

I conclude this post with an interesting exercise. Suppose there is a file with some content inside it, and the file also has some set of file permissions. How do you get that file across (to a container or elsewhere) with the same permission set?

Saturday, February 5, 2022

Cloud-init or userdata

TLDR; I explored cloud-init/userdata. It is a time-saving feature, mostly used when creating VMs in the cloud. (It is perhaps available on other distros as well.)

To start off, I would say that cloud-init is mostly a concept/feature associated with ubuntu server distros. I have been installing these distros a lot and have seen a lot of console output containing this term. But I never really understood, much less appreciated, its significance until I explored cloud services such as AWS, DigitalOcean, etc. On cloud services, this feature is commonly referred to as "userdata". (Not all cloud services use that term, or even have this feature.)

As the name suggests, it is meant to initialize your VM instance to the state you need to go about your business - exploring or running software - including:

1. Updating software
2. Installing favorite tools - e.g. ifconfig on linux
3. Creating user accounts, setting up the hostname, etc...

That is all you need to know. I will give you a quick account of how I came to experience this feature.

So I was exploring the SSDP protocol. In short, it is a way for programs to find other programs on the network. There are many use cases for this kind of feature in a program. Most commonly, it forms an essential part of similar programs trying to decide among themselves who should be the leader and who should be the slave.

There is a python package which implements this protocol - ssdpy. Being a python package, you have to install it manually. On ubuntu server (v20.04), python and pip don't come ready out-of-the-box; you have to install them manually after running your apt updates/upgrades.

A developer exploring cloud services would normally find surprises such as the ones mentioned above. But after all that is done, you explore the ssdpy program. It has CLI commands to start an SSDP server which can publish some service (usually on the same host machine), and a discovery program that can discover it. You run the discovery program from a different VM in the same network.
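If memory serves, the python side of it looks roughly like the sketch below. Treat the class and method names as assumptions to verify against the ssdpy README - I am recalling them, not quoting them. One VM advertises a service; the other discovers whatever is on the network:

# On VM 1 - advertise a service (names per my recollection of ssdpy's README)
from ssdpy import SSDPServer

server = SSDPServer("my-test-service")  # the USN to advertise
server.serve_forever()

# On VM 2 - discover services on the same network
from ssdpy import SSDPClient

client = SSDPClient()
for device in client.m_search("ssdp:all"):  # returns a list of dicts
    print(device.get("usn"), device.get("location"))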

However, SSDP doesn't work in the cloud. To test, I spun up two VMs, came to the above conclusion, and then quickly destroyed the VMs so I wouldn't incur operational costs. But then I thought about testing it with a different set of options.

So basically, all those commands required to set up ssdpy needed to be run on two VM instances. It seemed apt to use the "userdata" feature here. Further, along with this userdata initialization, I also downloaded some bash scripts from my github gist, intended to send notifications to my mobile phone when the VM was "ready".

The final verdict about SSDP is still the same - SSDP doesn't work in the cloud. I am not going to answer why that is so... This post was a short intro to "userdata", or cloud-init.