A very uninformative progress update: Mailpile 2?


#1

Hi everyone!

I posted a while back that I was planning to restart development this summer. I just wanted to inform y’all that I have done so, but it has proven to be a much more uphill battle than I had hoped. I haven’t got much to show for my efforts yet.

My current main priority is to get the project back in a state where people could participate and contribute. The biggest blockers there, IMO, are the Python 3 issue and the fact that our “web framework” is crufty and confusing.

So I started working on Python 3 support, and realized that it’s quite hard - the syntax issues aren’t a big deal, but due to the way Python 2 and Python 3 differ in handling of strings vs. bytes, I need to review pretty much every line that manipulates data and decide which should be which (strings? bytes? what encoding?).
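To make the strings vs. bytes problem concrete, here is a minimal sketch of the kind of decision each data-handling line needs. The helper names `to_text` and `to_wire` are hypothetical and are not taken from the Mailpile code; the latin-1 fallback is just one possible policy, chosen because it never fails.

```python
def to_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode bytes from disk or the network into a Python 3 str.

    Hypothetical helper: tries the declared encoding first and falls
    back to latin-1, which maps every byte to a character.
    """
    try:
        return raw.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        return raw.decode("latin-1")


def to_wire(text: str, encoding: str = "utf-8") -> bytes:
    """Encode a str back into bytes before writing it out."""
    return text.encode(encoding)


# In Python 2, "abc" + u"def" silently worked. In Python 3, mixing the
# two types raises TypeError, so every boundary needs an explicit choice.
try:
    b"Subject: " + "hello"
except TypeError as exc:
    print("Python 3 refuses to mix bytes and str:", exc)
```

The point is that the conversion has to happen at a known boundary with a known encoding, and that is what has to be decided line by line.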

The effort involved is close enough to a rewrite that I’m basically treating it as such: copying one source file over at a time, reading it, and rewriting what needs to be rewritten. I’m not trying to resist the urge to clean things up in the process.

I’m doing this in “stealth mode” right now, so I guess this entire post is a tease about vaporware - I’m not showing my work yet, because I just want to focus on the tech stuff and I’m enough of an introvert that I find coordinating and explaining to be a bit of a burden. I’m also not sure what I am doing is even a good idea, so I’m going to give it a try for a couple of months and reserve the right to just throw it all out and return to the current Python 2 code.

So that’s where I’m at. We’ll see how it goes!


#2

Thank you for the effort! You will quickly find the typical patterns in your code, which makes this work easier.

But doesn’t it make more sense to merge/reject the PR first?


#3

I don’t know!

I haven’t figured out how much of my time I want to spend on the current Mailpile code, vs. the rewrite/port. Reading and formulating opinions about pull requests is work, deciding whether to then merge them into my new code is ALSO work.

I’ll be honest that I have had so much on my mind lately, that I have just been ignoring it. I know this sucks, but my personal life has only left me with very limited time for tech work and I just can’t do everything. I’m sorry.


#4

Oh, okay. Didn’t think that the rewrite would be that thorough. Then, try to focus on the rewrite, I would say. After that, we see which PR/issues still make sense.


#5

I’d like to volunteer some development time on Mailpile. I’ve been searching for a mail client that meets my needs for years and I think it’s just time that I try to help build one :slight_smile:

I was going to try to pick up some of the low hanging fruit issues from GitHub as a starting place. Is that a useful thing to do? Or should I wait for a v2 branch? Or is there something else that I could help out with?

Thanks for your work on this project, it looks awesome :slight_smile:


#6

Just checking in! Hello!

I have been doing a lot of work on this, and I have to say it is going very well.

I expect it will still be another month or two before I have something to show people, but I am laying some very solid foundations and porting code over as I go.

There are some fundamental philosophical/design changes I can tell you all about though, in the meantime!

Mailpile 1 preloaded a lot of data into RAM to accelerate search results. This worked as intended - my Mailpile (v1) now has over 1 million messages on file, but results are still fast. This had serious downsides though. In particular, the startup time for the app goes up with time, increasing linearly as more mail is processed. It now takes me around 10 minutes to restart my Mailpile, which is frankly unacceptable. The other downside is RAM usage; most of those million messages I really don’t care about and Mailpile uses gigabytes of RAM to allow fast access to searches involving them. This isn’t the worst tradeoff in the world, but it’s also not the best.

I am taking a different approach to the metadata index in Moggie. Moggie no longer loads the data directly into RAM on startup; instead, it just loads some compact indexes and mmap()s the data files. Startup becomes almost instantaneous, and the plan is to rely on the operating system kernel to cache frequently used metadata in RAM, instead of doing so ourselves. The data structure itself is designed to facilitate this: recently received mail and old mail will (over time) occupy different files and different regions of disk, allowing the kernel to more easily cache the data we are most likely to care about and ignore the rest. We waste some disk space to facilitate fast in-place edits/updates, but make up for it by using a tighter encoding scheme than Mailpile did.
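As a rough illustration of the mmap() idea (not Moggie's actual format), here is a minimal sketch using Python's standard `mmap` and `struct` modules. The record layout (an 8-byte timestamp plus a 4-byte flags field) and the names `RECORD`, `write_records` and `MetadataFile` are assumptions made up for this example.

```python
import mmap
import struct

# Hypothetical fixed-width record: 8-byte timestamp + 4-byte flags.
# Fixed-width slots are what make cheap in-place updates possible.
RECORD = struct.Struct("<QI")


def write_records(path, records):
    """Create a data file from (timestamp, flags) pairs."""
    with open(path, "wb") as fd:
        for timestamp, flags in records:
            fd.write(RECORD.pack(timestamp, flags))


class MetadataFile:
    """Maps a non-empty record file; the kernel pages data in on demand."""

    def __init__(self, path):
        self._fd = open(path, "r+b")
        # Length 0 maps the whole file. Nothing is read eagerly here.
        self._map = mmap.mmap(self._fd.fileno(), 0)

    def __len__(self):
        return len(self._map) // RECORD.size

    def get(self, index):
        """Return (timestamp, flags) for one record."""
        return RECORD.unpack_from(self._map, index * RECORD.size)

    def set_flags(self, index, flags):
        """Overwrite one record in place, without rewriting the file."""
        timestamp, _ = self.get(index)
        RECORD.pack_into(self._map, index * RECORD.size, timestamp, flags)

    def close(self):
        self._map.flush()
        self._map.close()
        self._fd.close()
```

Opening a `MetadataFile` costs about the same regardless of file size, because only the pages that `get()` or `set_flags()` actually touch are read from disk, and the kernel decides what stays cached.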

Overall this will make Moggie’s search performance a little more erratic (especially if people are using spinning-rust storage), but we should make up for it by not wasting CPU and RAM on data we don’t intend to use. The code itself is also smaller: Moggie just does less work in Python space, offloading more to the OS kernel, which is also good for performance.

I’ve been having a lot of fun working on this.

I was testing the code last night. Combining the new metadata index with some carefully optimized mbox loading code I also wrote recently, Moggie can build a fresh new metadata index from scratch many times faster than Mailpile 1 could merely load its pre-calculated index from disk.

All of which is to say, progress is being made, and I am becoming more confident in my decision to start from scratch and use Mailpile 1 as a source of code snippets I can use or discard as I like. It is feeling very good right now.


#7

Oh yeah, the code name for Mailpile 2 is Moggie. It is an homage to the mutt mail client, which I see as inspiration for how light and fast I’d like the app to feel. A mutt is a mixed-breed dog, a moggie is a mixed-breed cat. :smile: