Writing an importer

If the included importers are not enough for you, you’ll want to find other ones or write your own.

Unfortunately, the Docker-packaged version doesn’t provide an easy way to include external importers yet (if you need this, let’s try to figure out how to simplify this use case). Probably the easiest way is to go through a standalone setup on your machine (via conda), or to go through a dev setup, make the necessary changes, and build your own Docker container.

Including existing importers

The two primary frameworks for which you’ll find pre-written importers are ingest and beangulp. They are quite similar (with some differences in the API), with beangulp being a bit more modern. Chances are you can adapt an existing importer based on either of them to work with beancount-import (using the forked repo), as is done in Evernight/beancount-importers.

Modifying existing importers

As usual, the base guide can be found in the official Beancount docs.

You may want to modify an importer from https://github.com/Evernight/beancount-importers if you want to classify more transactions automatically. See the commented sections in https://github.com/Evernight/beancount-importers/blob/main/src/beancount_importers/import_wise.py#L34 for an example.

Also, https://github.com/Evernight/beancount-importers/blob/main/src/beancount_importers/bank_classifier.py contains a reverse mapping from payee to expense category. Every item added there results in automatic categorization of all transactions from that merchant.
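To illustrate the idea of such a reverse mapping, here is a minimal sketch. The names `PAYEES_BY_CATEGORY`, `CATEGORY_BY_PAYEE` and `classify`, as well as the sample merchants, are made up for this example; the actual bank_classifier.py may be organized differently.

```python
# Hypothetical data: categories listed with the payees that belong to them.
PAYEES_BY_CATEGORY = {
    "Expenses:Groceries": ["Tesco", "Sainsbury's"],
    "Expenses:Transport": ["TfL", "Uber"],
}

# Invert the mapping once so that lookups go from payee to category.
CATEGORY_BY_PAYEE = {
    payee.lower(): category
    for category, payees in PAYEES_BY_CATEGORY.items()
    for payee in payees
}


def classify(payee, default="Expenses:Uncategorized"):
    """Return the expense account for a payee, or the default if it is unknown."""
    return CATEGORY_BY_PAYEE.get(payee.strip().lower(), default)
```

Listing payees per category keeps the table easy to edit by hand, while the inverted dictionary keeps each lookup a single dictionary access.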

There is also a TRANSACTIONS_CLASSIFIED_BY_ID mapping in the import*.py files: https://github.com/Evernight/beancount-importers/blob/main/src/beancount_importers/import_wise.py#L14. It becomes irrelevant if you’re using beancount-import, though: just modify the category manually after importing and rely on the machine learning classifier afterwards.

Copying an importer

If your bank is not present but provides statements in a similar csv file, you can copy an existing importer (e.g. import_wise.py) and start debugging. I’d recommend running the importer directly, like

      python3 -m beancount_importers.import_wise extract <csv_file>

on a small csv file you download.

Maybe you will only need to change the mapping of columns here: https://github.com/Evernight/beancount-importers/blob/main/src/beancount_importers/import_wise.py#L97, or maybe you will need changes in the categorizer function as well.

Some banks already pre-categorize transactions. You could map their categories to your Expenses categories, as in https://github.com/Evernight/beancount-importers/blob/main/src/beancount_importers/import_monzo.py#L13, and that will be a good starting point.

At the moment there is some duplication, as csv.CSVImporter and IngestImporter provide setup for two very similar but different libraries (beangulp and beancount.ingest respectively) for parsing csv into the Beancount format. beangulp is the more modern one and should normally be preferred.

When you get satisfactory results running your importer directly, add an IngestImporter and the corresponding configuration(s) to https://github.com/Evernight/beancount-importers/blob/main/src/beancount_importers/beancount_import_run.py#L22 so it can be configured and loaded via the YAML config.

Once your importer is ready and tested, why not contribute it to the repository?

Writing an importer

If the provided libraries and boilerplate still don’t work, you may consider something completely different.

This guide, https://reds-rants.netlify.app/personal-finance/the-five-minute-ledger-update/, describes various tradeoffs and recommendations on how to handle them in more detail. The author also has code that implements his approach: https://github.com/redstreet/beancount_reds_importers. I have not tried this framework out yet, though.

Generally, you may choose to use any set of tools and file structure that you want.

Existing code

The recommended way to write importers has recently been to use https://github.com/beancount/beangulp (before it, it was beancount-ingest, and you can still find importers based on both).

Some other projects exist that provide ready-made importers: https://github.com/OSadovy/uabean/, https://github.com/tarioch/beancounttools (https://tariochbctools.readthedocs.io/en/stable/importers.htm), https://github.com/luciano-fiandesio/beanborg, https://github.com/beancount/beanbuff or even more here. Some use the provided libraries, some use more idiosyncratic approaches. Later in the guide we will use beancount-import; beanhub-import is more recent and also looks interesting.

Of course, you don’t have to use anything at all. You can also write code that takes data and spits out a Beancount file; it’s pretty straightforward to do. But these libraries provide helpers to organize and generalize this process, so you have to write less code and can reuse more of what already exists.
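As a rough illustration of the "no library at all" route, the sketch below turns CSV text into Beancount transactions using only the standard library. The column names (`Date`, `Description`, `Amount`), the ISO date format, and the function name `csv_to_beancount` are assumptions made for this example; a real bank export will likely need its own column names and date parsing.

```python
import csv
import io
from decimal import Decimal


def csv_to_beancount(csv_text, account, currency, contra="Expenses:Uncategorized"):
    """Convert CSV text with Date, Description and Amount columns into Beancount entries.

    Dates are assumed to already be in YYYY-MM-DD format.
    """
    entries = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        amount = Decimal(row["Amount"])
        # A double quote would terminate the Beancount string, so swap it out.
        narration = row["Description"].replace('"', "'")
        entries.append(
            f'{row["Date"]} * "{narration}"\n'
            f"  {account}  {amount} {currency}\n"
            f"  {contra}\n"  # amount left blank so Beancount balances the posting
        )
    return "\n".join(entries)
```

Calling it with something like `csv_to_beancount(open("statement.csv").read(), "Assets:Bank:Checking", "EUR")` and redirecting the result into a .bean file is already a working, if very bare, importer.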

Evernight/beancount-importers

I had not found importers already written for the banks I was using at the time, so I wrote my own: https://github.com/Evernight/beancount-importers.git. All of these banks provide statements in CSV format, and together with the existing libraries all you need is to define some config in code and create a wrapper to launch it.

This section describes how to use and/or debug them directly, without the beancount-import shim.

If you haven’t already downloaded the importers, clone the repository:

      git clone https://github.com/Evernight/beancount-importers.git

Using conda or virtualenv will be helpful. You’ll need to install the importers locally, as in:

      cd beancount-importers
      pip install -e .

There are import_* files corresponding to each bank. To get a Beancount file, all you need to do is download the statement from the bank and run the script like this:

      python3 -m beancount_importers.import_wise extract <csv_file> > wise.bean

Then make sure the resulting file is included in the main ledger. You could use wildcards like in https://github.com/Evernight/lazy-beancount/blob/main/example_data/main.bean#L60 to avoid having to add each parsed file manually.

Now, all you need is to design a structure to store the input csv files, another one to store the resulting bean files, and a process (or even a script) to convert the former into the latter. Also make sure that transactions are not duplicated (that is, the csv statements should not overlap). I would not recommend editing the resulting bean files manually, as it will make the system harder to maintain. Instead, try to classify everything in code and re-run it on the old files as well.
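Such a conversion script could look roughly like the sketch below. The directory layout, the function names `plan_conversions` and `convert`, and the idea of passing the importer module name as a parameter are assumptions for illustration; only the `python3 -m <module> extract <csv_file>` invocation comes from the commands shown earlier.

```python
import subprocess
import sys
from pathlib import Path


def plan_conversions(csv_root, bean_root):
    """Pair every CSV under csv_root with a .bean path mirrored under bean_root."""
    csv_root, bean_root = Path(csv_root), Path(bean_root)
    return [
        (csv_path, (bean_root / csv_path.relative_to(csv_root)).with_suffix(".bean"))
        for csv_path in sorted(csv_root.rglob("*.csv"))
    ]


def convert(csv_path, bean_path, module):
    """Run `python -m <module> extract <csv>` and write its stdout to bean_path."""
    bean_path.parent.mkdir(parents=True, exist_ok=True)
    result = subprocess.run(
        [sys.executable, "-m", module, "extract", str(csv_path)],
        check=True,  # fail loudly if the importer exits with an error
        capture_output=True,
        text=True,
    )
    bean_path.write_text(result.stdout)
```

A driver loop would call `convert(csv_path, bean_path, "beancount_importers.import_wise")` for each pair returned by `plan_conversions`. Regenerating every file on each run is deliberate here: it is what lets a fix in the importer code reach older transactions too.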

For a while I’ve been using just something like this. It’s convenient to debug changes in the importer code, especially the ones that you would want later to apply to the older transactions.

Beancount-import

However, there are a few problems with the previous approach. The primary one is that you can’t manually edit resulting bean files and adjust each transaction. Well, nothing actually stops you, but the resulting process would become rather fragile. What if you made a mistake in the parser script and would like to reapply fix to the older transactions as well?

Fortunately, there is a library that makes the import process even more convenient: https://github.com/jbms/beancount-import. It provides a mechanism for matching input files against the resulting Beancount transactions, an additional automatic classifier for transactions, and a UI to preview and adjust your imports.

For the banks supported in https://github.com/Evernight/beancount-importers I’ve already written a wrapper for you.

To use the system, download all csv files into beancount_import_data. I prefer to split them by account, so put them in the appropriate folders, e.g. beancount_import_data/monzo, beancount_import_data/revolut_eur, etc.

After you’ve done that, just run the beancount_import_run.py script:

      PYTHONPATH=.:beangulp python3 beancount-importers/beancount_import_run.py --output_dir beancount_import_beans

Go to http://localhost:8101/ and you will get to the beancount-import interface, which shows you the transactions to be imported one by one. For each, you can either add it or put it into ignored.bean (for example, if it’s a duplicate of another transaction). Make sure the beancount_import_beans folder is included in your main.bean file.

This way you can import all the transactions for the source. If you stop and restart the beancount-import server using beancount-importers/beancount_import_run.py and if you have set up everything correctly, it won’t show the transactions for files already imported.

PDF

If your bank only provides statements in the PDF format, I’ve had success earlier parsing PDF files with https://github.com/chrismattmann/tika-python. Also check out this section. Once you convert the statement to CSV (explicitly or in-memory), the rest of the process should remain pretty much the same.
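A minimal sketch of the in-memory conversion might look like this. It uses `parser.from_file` from tika-python to get the text; the line layout matched by `LINE_RE`, the output column names, and the function names are hypothetical and will need adapting to what your bank’s PDF actually contains.

```python
import csv
import io
import re

# Hypothetical layout: "2024-01-05  Coffee shop  -3.50" on each transaction line.
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2})\s+(.+?)\s+(-?\d+\.\d{2})$")


def extract_text(pdf_path):
    """Return the plain text of a PDF via tika-python (which needs Java at runtime)."""
    from tika import parser  # imported lazily so text_to_csv works without Tika

    return parser.from_file(pdf_path)["content"] or ""


def text_to_csv(text):
    """Keep only lines that look like transactions and emit them as CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Date", "Description", "Amount"])
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            writer.writerow(match.groups())
    return out.getvalue()
```

`text_to_csv(extract_text("statement.pdf"))` then yields CSV text that can be fed to a CSV-based importer. Keeping the text extraction separate from the line parsing makes it possible to debug the regular expression on pasted text without running Tika at all.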

If you’re less picky about privacy

If you’re not particularly concerned about privacy of all (or some) of your transactions and are ok with some centralisation of that data and giving trust to 3rd-party services, there are options to make transaction import easier, using less configuration and manual effort. Examples are BeanHub’s Direct Connect using Plaid or gocardless-to-csv using GoCardless aggregator.
