Apache FOP and document properties

I am a little bit obsessive about checking out the document properties in PDF files I read. I can’t explain why, but there you have it.

I was sad when I noticed the PDF file being emitted by my XML has no document properties. So I figured, no problem, I can just go find the right FO tags, grep for them in the DocBook XSL and reverse engineer what stuff I need to put into my DocBook to get them set right. I figured it would be something obscure like, <author> or something. In fact, DocBook already knows who the author is, in order to format the title page nicely, so no dice there.

What I found is this page on the Apache FOP FAQ that explains that FOP can’t do it. WTF? This can’t be that hard, and it really seems like a natural thing that would make it into version 1.0. (Of course, FOP is version 0.94, which might explain something as well.) In their defense, it seems like this is also braindeadness in the FO spec, because I found a commercial implementation of FO that says you have to resort to special extension namespaces to specify metadata in their implementation. But Apache FOP already has the “fox” namespace for this purpose, so no big deal.

The solution is to write your own Java program (real user friendly there, guys) that uses a nifty PDF library called iText to add on the metadata in a post processing step.

In case someone else needs it, here’s what I came up with:

/* Based on an iText example. */

import java.io.FileOutputStream;
import java.util.HashMap;

import com.lowagie.text.pdf.PdfReader;
import com.lowagie.text.pdf.PdfStamper;

public class Stamp {
    public static void main(String[] args) {
        try {
            if (args.length != 6) {
                System.out.println("Arguments expected:");
                System.out.println("  pdf-in pdf-out title subject author keywords");
            } else {
                // we create a reader for a certain document
                PdfReader reader = new PdfReader(args[0]);

                // we create a stamper that will copy the document to a new file
                PdfStamper stamp =
                    new PdfStamper(reader,
                                   new FileOutputStream(args[1]));

                // adding the metadata
                HashMap moreInfo = new HashMap();
                moreInfo.put("Title", args[2]);
                moreInfo.put("Subject", args[3]);
                moreInfo.put("Author", args[4]);
                moreInfo.put("Keywords", args[5]);

                stamp.setMoreInfo(moreInfo);

                // closing PdfStamper will generate the new PDF file
                stamp.close();
            }
        } catch (Exception de) {
            de.printStackTrace();
        }
    }
}

The <xen> of <xslt>

For a project I am doing right now, I descended into DocBook hell. Not completely unscathed, I made it through the learning curve (why don’t they call it what it is: The Unfathomable and Horrific Tunnel of Learning) and blinked slowly in the light of day.

I realized DocBook is nice, but it’s not actually what I wanted. Doh.

What I wanted was a structured way to represent my data, and I want two things to happen to it. Today, I want to publish my data with DocBook. Tomorrow I want someone to be able to suck the brains out of my DocBook document (leaving it to wander the earth as a zombie) and to put them into a wiki so that my project can become a community-maintained database, instead of being a single DocBook-formatted document that lives in my Subversion repository.

The way to do that is to step back from DocBook as a primary source, and instead generate DocBook from my data. I organize my original data into XML, then reformat it with XSLT from my data structure into DocBook, then the DocBook stylesheets reformat it into XSL-FO or HTML, and those become readable documents.
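Here is a minimal sketch of that middle step, data XML to DocBook via XSLT. It uses the XSLT processor built into the JDK only so that it is self-contained; it is not the toolchain I actually use. The catalog/item structure, the sample text, and the class and method names are all invented for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class DataToDocBook {
    // Hypothetical source data: this <catalog>/<item> structure is made up.
    static final String DATA =
        "<catalog title='Sample catalog'>"
      + "<item name='First'>Text of the first item.</item>"
      + "<item name='Second'>Text of the second item.</item>"
      + "</catalog>";

    // XSLT 1.0 stylesheet mapping the invented structure onto DocBook elements.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' indent='no'/>"
      + "<xsl:template match='/catalog'>"
      + "<article><title><xsl:value-of select='@title'/></title>"
      + "<xsl:apply-templates select='item'/></article>"
      + "</xsl:template>"
      + "<xsl:template match='item'>"
      + "<section><title><xsl:value-of select='@name'/></title>"
      + "<para><xsl:value-of select='.'/></para></section>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the data and returns the DocBook markup as a string.
    static String toDocBook(String dataXml, String xslt) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(dataXml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(toDocBook(DATA, STYLESHEET));
    }
}
```

The DocBook that comes out is then fed to the DocBook stylesheets as usual to get XSL-FO or HTML.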

To other people contemplating the same idea, I’d say “go for it”. But beware the startup cost is huge. The results are nice. Here are the things you’ll need to learn:

  • How to use your XML editor (Emacs in nXML-mode for me). You cannot just limp along with Notepad or vi. Won’t work, don’t try it.
  • How to write a schema for your new data structure. There are like 12 schema formats to choose from, but I chose Relax NG Compact Syntax because nXML-mode prefers it. Don’t use DTD or else your brain will melt. SGML = Bad. Everything post-SGML = slightly less bad. Farther from SGML is better (thus Relax NG compact = best).
  • How to load your schema into your editor. (Tip: C-c C-s C-f for nXML)
  • XSLT (which is the ugliest, stupidest, most verbose language ever foisted by computer science on users)
  • How to use XInclude to build up XML documents from pieces. Don’t skimp on this. Figure it out and use it, because the alternative is the supreme ugliness that is SGML external entities. Remember: SGML = bad, XML = slightly less bad.
  • How to make your XSLT processor work (and which of the 12 to choose — go with xsltproc, it’s super fast and stable, and doesn’t care which exact point release of Java you have. Remember: C good. Java bad. Write once, test everywhere…)

Don’t try to do it without a schema. You need to be 100% sure your data is in the right format before you go too far hacking on the XSLT, or else you’ll get confused and sad. Just suck it up and get the schema right, so that your editor can whack you with a clue-by-four before XSLT starts wasting your time going off into tag soup never-never land.
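For the XInclude item in the list above, here is a rough sketch of what "building up a document from pieces" looks like. It uses the JDK's own parser and two throwaway files in a temp directory; the file names, element names, and class name are all made up for the example, and Java 11 or later is assumed for Files.writeString.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

class XIncludeDemo {
    // Writes a main document plus one included piece, then parses the main
    // document with XInclude processing switched on.
    static Document assemble()
            throws IOException, ParserConfigurationException, SAXException {
        Path dir = Files.createTempDirectory("xinclude-demo");

        // The piece that lives in its own file (hypothetical content).
        Files.writeString(dir.resolve("second.xml"),
            "<item name='Second'>Text of the second item.</item>");

        // The main document pulls the piece in with xi:include.
        Files.writeString(dir.resolve("catalog.xml"),
            "<catalog xmlns:xi='http://www.w3.org/2001/XInclude'>"
          + "<item name='First'>Text of the first item.</item>"
          + "<xi:include href='second.xml'/>"
          + "</catalog>");

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);  // XInclude needs namespace processing
        factory.setXIncludeAware(true);   // resolve xi:include while parsing
        return factory.newDocumentBuilder()
                      .parse(dir.resolve("catalog.xml").toFile());
    }

    public static void main(String[] args) throws Exception {
        Document doc = assemble();
        System.out.println("items after inclusion: "
            + doc.getElementsByTagName("item").getLength());
    }
}
```

If you stick with xsltproc, as I recommend above, check its own options for XInclude processing rather than taking this Java route.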

Trafigura’s West African dumping

Here’s an interesting story, well told, about an industrial process that takes refinery waste from the United States (derived from high-sulfur Mexican crude oil), cycles it through Europe, then dumps the result in West Africa.

The company running this racket (or “innovative commodity exchange”, as they call it) is Trafigura.

Learn more here:

Here’s a quick summary:

  • An arbitrage opportunity exists for energy traders based on differing regulatory frameworks in rich countries and poor ones.
  • A chemical process can turn a waste product in one jurisdiction into two outputs, gasoline usable in a lenient jurisdiction (West Africa), and the waste extracted from the original product.
  • If you buy one ship of high quality gas, you can dissolve the waste stream from several other tankers of coke gasoline into it, meaning that you can dispose of the waste stream by getting your customers to burn it for you in their cars.
  • A clever and immoral company can take advantage to squeeze profits where others just saw costs. The profits come from the externalities of burning high-sulfur gasoline (decreased longevity due to sulfur-rich smog)
  • None of this is precisely against the law. Trafigura and its contractors made minor infractions here and there, playing fast and loose with the rules. But what they are doing is fundamentally not illegal — though it should be.
  • Trafigura was working towards, or achieved, the ability to reprocess this stuff at sea, likely to further reduce the power of regulators over their work.

How much other stuff like this is going on? Who are the people that organize and operate this kind of thing? How do they sleep at night?

Gates Foundation vs the Lancet

The Lancet has published an academic paper analyzing the deployment of funds at the Gates Foundation against a backdrop of the actual burden of disease. The bottom line is the Gates Foundation does not come out looking too good, seemingly interested in whizbang gadgets and not in focusing on the job at hand. Another really interesting and sad note was the extent to which being near the Gates Foundation, geographically or culturally, gets you in the money. PATH and the University of Washington raked in the cash. African researchers? Not so much.

At the same time, the Lancet published an editorial and a commentary. Of course, being academics, you know the knives are going to come out and some serious backstabbing is going to happen. (“They fight so hard because the stakes are so low.” Sigh.) They saved the really rude things for the editorial, a particularly cowardly form of academic infighting, as that way no one has to put their name on the insults. At least the commentary is signed, though in keeping with the fact they will be held accountable for their words, they are much more restrained.

The thing that most pissed me off about the Lancet’s editorial is the stuff about transparency. They whine and moan about how the Gates Foundation didn’t come ask them what to do. You know what, all you Masters of the Public Health Universe? You had your chance. You wasted 100 fucking years, and things just got worse and worse. Some of you were wanking, writing useless papers. Some of you were too busy teaching the next generation of wankers to go out and find out what it’s like to be poor. The rest of you were on public health tourism packages, in business class and five star hotels. There are no poor people in the Addis Ababa Sheraton… except the waiters, but you don’t notice them anyway, I suppose.

If the Gates Foundation wants to know what works, the only way to know is to go ask the people doing it, those laboring in obscurity in tiny, underfunded local NGOs, and those laboring in sweaty, dirty, dangerous, uncomfortable places with overfunded and overexposed NGOs like MSF.

As for the commentary, its major point (made three times over, according to my underlines on the copy I read on the bus) is that the Gates Foundation should be investing in putting into practice things that we already know work, instead of whizbang things for the future. The whizzy MPH speak for this is “service delivery”: i.e. making sure the pharmacy wasn’t cleaned out by thieves the night before the sick baby arrives in the ambulance that someone remembered to put fuel into.

I would be amenable to this argument, except that we already know why service delivery is so bad. It’s because a few people in this world are corrupt assholes, and something is wrong with the cultures where service delivery is bad that lets the corrupt assholes ruin things for those who just want to be healthy. The fact that people are corrupt assholes is not a problem. England has plenty of corrupt assholes (in fact, they seem to be in charge of the parliament here), but the NHS keeps running anyway. In Switzerland every year there are probably two or three doctors who lose their license for insurance fraud, billing for stuff they didn’t do. What’s the difference between the corrupt assholes in Switzerland and places where service delivery is failing patients? It’s good governance and accountability.

Even a short little career in humanitarian aid like I have had can make you cynical, and I’ll freely admit I am cynical. But I see hope everywhere I look, too. Good people trying to make their health system work get torn down by the system, and the system is made of a thousand corrupt assholes, from big corruption (the Minister of Health of Uganda for example) all the way down to little corruption (the numerous minor staff problems we faced every single week in Saclepea, Liberia).

The answer is that people won’t be healthy until they and their neighbors take responsibility to make a health system that works. It doesn’t matter how much the Lancet whines to the Gates Foundation, and it doesn’t really matter what the Gates Foundation invests in anyway. The demand for healthy communities needs to come from educated, organized, and disciplined communities. Whatever helps get us there, we should invest in. Whatever is unrelated to that is a distraction and not an ethical use of time and money.

Just DO It

There’s a guy in Philly who realized he’s rich, because he makes $30,000 as a community organizer, and many people in the world live on far less. He decided to go on a diet, for his own health, but also to understand what it means to live on a simple diet.

That’s already an interesting story, but what’s really interesting is this video he posted, where he talks about what held him back from starting the project (fear) and what the reality has been (support). This is exactly the same thing I found when I made the big change to leave IT and be a log for MSF.

To anyone else out there, and I know some of you are out there because you come to me for advice, here’s the cornerstone of all advice: Just do it. What do you care what other people think? Why do you think your guess about what they will think is right anyway? Just do it, and then be surprised by the reactions, and by the good that comes into your life as the payoff for the risk you took.

Community Finance in West Africa

Vasco Pyjama talks about community finance. The ROSCA is known as a “sou-sou” in west Africa, or at least in Nimba county, Liberia. Sousous run for a fixed term, based on the number of members. If there are 10 members and the contribution is $10, each month one of the members will get $90 (9 other members * $10 each). At the end of ten months, the sousou can either be restarted, or the membership can be renegotiated (for example, to drop people who failed to pay on time during the past sousou period). If the sousou is reconstituted with more or fewer people, it doesn’t really change anything, it just runs shorter or longer until the next restart. Sousous run best when they are between 6 and 12 people for social and economic reasons (a 5x – 11x payoff is manageable in a cash society). The order of the payouts is determined randomly at the startup meeting of the sousou. Sometimes people negotiate to trade their places in the payout order in order to assure that the sousou payout would arrive at a time that was convenient for them.
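The arithmetic above is simple enough to write down. This is only a sketch of the numbers in the paragraph (the class and method names are mine): each month one member collects everyone else's contribution, and a full cycle lasts as many months as there are members.

```java
class Sousou {
    // What one member receives in their payout month:
    // the contribution of every other member.
    static int payout(int members, int contribution) {
        return (members - 1) * contribution;
    }

    // Length of one full cycle in months: one payout per member.
    static int cycleMonths(int members) {
        return members;
    }

    public static void main(String[] args) {
        int contribution = 10; // dollars per month, as in the example above
        for (int members : new int[] {6, 10, 12}) {
            System.out.printf("%d members at $%d: payout $%d (%dx), cycle of %d months%n",
                members, contribution, payout(members, contribution),
                members - 1, cycleMonths(members));
        }
    }
}
```

With 6 to 12 members the payout multiple runs from 5x to 11x, which is where the range quoted above comes from.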

One thing Vasco forgot to mention about it is the very powerful social aspect to village finance. In a village environment, saving face is important. Failing to be a reliable member in a sousou is pretty embarrassing. But the worse punishment comes when people are invited to the next sousou, and you are excluded. People who are trustworthy get the benefits of a community savings scheme, and those who aren’t are excluded. Tough, but transparent. Those excluded from a sousou this time might get another chance with another sousou later. The basic fabric of village life means that everyone has a chance at redemption.

The proceeds of social investment in Nimba are used for things like concrete floors and new roofs. Some richer guys planned on buying a car by combining savings from their salary over the year and their sousou windfall. Then they paid their brother to drive it, making a small foundation of a family taxi company. Others would arrange for their sousou windfall to come in the same month as their vacation, so they could use the money to buy cement and work full time on a second house. After two or three years of this trick, using sousou windfalls every six months or so to advance construction, they’d be landlords — and that’s the real secret to getting rich in Liberia. (To be clear: landlords benefit from a transfer of wealth, they do not actually make new wealth by producing something. So it’s relative prosperity, not global GDP growth.)

There are some added systems that some sousous I learned about used. One had a loan concept, where members who wanted to be moved to the head of the line had to pay higher amounts in ($12 instead of $10) with the “interest” being accumulated into a long-term reserve pool. The reserve pool would then be paid out at the dissolution of the sousou to the non-borrowing members. Another sousou, this one with a large reserve after years of continuous operation, included a “compulsory loan month” in August. Each member was compelled to take a loan in August and to spend it in the local economy. He had to pay back this loan, along with his normal sousou contribution, during the rest of the year. They explained to me that the compulsory loan was intended to create demand in the local marketplace at a time when the local shopkeepers were normally seeing lower than average sales. Another sousou arranged their payout schedule to defer some of the normal payouts until December, to help buy Christmas presents.
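For the loan variant, here is a back-of-the-envelope sketch of how the reserve pool might add up. Only the $10 and $12 figures come from what I was told. The rest is assumption on my part: that a borrower pays the higher amount every month of the cycle, and that the reserve is split equally among the non-borrowing members at dissolution.

```java
class SousouReserve {
    static final int BASE = 10;      // normal monthly contribution
    static final int BORROWER = 12;  // contribution for a member moved to the head of the line

    // Interest collected over one cycle.
    // Assumption: borrowers pay the higher amount in every month of the cycle.
    static int reserve(int members, int borrowers) {
        int months = members; // one payout per member per cycle
        return borrowers * (BORROWER - BASE) * months;
    }

    // Share of the reserve per non-borrowing member at dissolution.
    // Assumption: the pool is split equally among them.
    static double sharePerSaver(int members, int borrowers) {
        return (double) reserve(members, borrowers) / (members - borrowers);
    }

    public static void main(String[] args) {
        int members = 10;
        int borrowers = 2;
        System.out.printf("reserve after one cycle: $%d, share per non-borrower: $%.2f%n",
            reserve(members, borrowers), sharePerSaver(members, borrowers));
    }
}
```

A real sousou would negotiate these details at the startup meeting, so treat the numbers as an illustration and nothing more.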

As an aside… the business of an eastern Congolese motoman is interesting as well. I don’t remember the figures anymore, but it was a very tidy little business. I think the motorcycle buyer could get his investment 100% paid off in 9 months by renting the moto to a motoman, and then for the remaining 3 year life of the motorcycle, he made $25 a month “moto rent”, and the motoman (usually a younger relative) got a steady job driving the motorcycle. The only problem was competition. In eastern Congo, the business case for motos is blindingly obvious, and the right conditions exist: reliable supply of cheap Chinese motos and parts from the port in Mombasa, capital ready to invest, and roads that are hostile to more comfortable cars. So there are 10 motos for every customer. The price does not collapse, because it tracks the running costs (gasoline) closely. Instead the weekly wage of a moto driver collapses, because the moto owners still demand their rent, no matter how bad competition gets.

Learning to Love Social Welfare

When you fall in love with someone from another country, it just happens. Then what comes after that takes more effort. You have to learn to love the other culture you’ve thrown your lot in with. That process is, and will continue to be, a joy for Marina and me.

Here’s an article that describes some of what I’ve learned about how life is organized in Europe. I followed the same course, roughly, as the author. Though I have to admit, I was never as skeptical as he was. Perhaps that’s because I was given humanist values by my mother, and I always understood that something wasn’t working right in my homeland.

Another great example is something I told Mari the other day. She said something about a dream for the future. I told her, “Congratulations! A little-known benefit of marrying an American is that you are entitled to the American dream. As long as we are together, if you can dream it, we will work to make it happen.”

This is not fluff-speak. The problem of “self advancement” is a serious one for Europeans, and particularly for Swiss, who see themselves as coming from very small cantons deep in the middle of a very small country (the “we’re just a bunch of farmers” effect). There are some Swiss people who look down on those who seek to improve themselves. There are some who criticize their neighbors when they travel too far, or study too much. Entrepreneurialism is much rarer in Europe. The most active, engaged, and entrepreneurial Europeans I have ever met live in Silicon Valley. They have to escape; they can’t imagine a life stuck in the rank and file of Siemens, biding their time until they get a turn to be a boss and have a tiny bit of control.

At the same time I’m coming to understand what real social welfare is, Marina is coming to understand what the freedom to dream is, and what it feels like when any dream (big or small) is met with “Great! Let’s go do it!”